Methods for altering the cleavage specificity of a type IIG restriction endonuclease

ABSTRACT

Methods are provided for altering the cleavage specificity of a Type IIG restriction endonuclease, the Type IIG restriction endonuclease being characterized by a cleavage domain adjacent to a methylase domain, the methylase domain located adjacent to a specificity domain. The method includes ligating DNA or protein sequences to form a fusion DNA or fusion protein. Where a fusion DNA is formed, the host cell is transformed with the fusion DNA to express a Type IIG restriction endonuclease with altered cleavage specificity.

CROSS REFERENCE

[0001] This application is a continuation-in-part of U.S. patent application Ser. No. 10/150,028 filed May 17, 2002, which is a divisional application of U.S. application Ser. No. 09/693,146 filed Oct. 20, 2000, now U.S. Pat. No. 6,413,758 issued Jul. 2, 2002, each of which is hereby incorporated by reference in its entirety.

BACKGROUND OF THE INVENTION

[0002] Type II restriction endonucleases are a class of enzymes that occur naturally in bacteria and in some viruses. When they are purified away from other bacterial proteins, restriction endonucleases can be used in the laboratory to cleave DNA molecules into small fragments for molecular cloning and gene characterization.

[0003] Restriction endonucleases act by recognizing and binding to particular sequences of nucleotides (the ‘recognition sequence’) along the DNA molecule. Once bound, they cleave the molecule within, to one side of, or to both sides of the recognition sequence. Different restriction endonucleases have affinity for different recognition sequences.

[0004] Bacteria that produce restriction endonucleases protect their own DNA by methylating nucleotides at the cleavage site of the endonuclease. The coordinated production of a restriction endonuclease and a specific methylase is called a restriction-modification (R-M) system.

[0005] Methyltransferases are complementary to restriction endonucleases and they provide the means by which bacteria are able to protect their own DNA and distinguish it from foreign, infecting DNA. Modification methylases recognize and bind to the same recognition sequence as the corresponding restriction endonuclease, but instead of cleaving the DNA, they chemically modify one particular nucleotide within the sequence by the addition of a methyl group (C5-methyl cytosine, N4-methyl cytosine, or N6-methyl adenine). Following methylation, the recognition sequence is no longer cleaved by the cognate restriction endonuclease. The DNA of a bacterial cell is always fully modified by the activity of its modification methylase. It is therefore completely insensitive to the presence of the endogenous restriction endonuclease.

[0006] By means of recombinant DNA technology, it is now possible to clone genes and overproduce the enzymes in large quantities. Restriction endonucleases have highly specific recognition and cleavage sites. It would be desirable to expand the repertoire of restriction endonucleases by retaining a particular cleavage position while modifying the recognition site of an enzyme.

SUMMARY OF THE INVENTION

[0007] In a preferred embodiment, a method is provided for altering the cleavage specificity of a Type IIG restriction endonuclease where the Type IIG restriction endonuclease is characterized by a cleavage domain adjacent to a methylase domain, the methylase domain located adjacent to a specificity domain. The method includes ligating a first DNA sequence and a second DNA sequence to form a fusion DNA. The first DNA sequence includes a DNA segment encoding a catalytic domain and an N-terminal portion of a methylase domain of a first Type IIG restriction endonuclease. The second DNA sequence includes a DNA segment encoding a specificity domain and a C-terminal portion of a methylase domain of a second Type IIG restriction endonuclease, such that the ligation occurs between sequences encoding the methylase domain. A preparation of host cells is then transformed with the fusion DNA for expressing a Type IIG restriction endonuclease with altered cleavage specificity.

[0008] In an additional embodiment, the method described above further includes introducing a mutation into the cleavage domain to enhance the viability of the transformed host cell.

[0009] In an additional embodiment, a method described above further includes a sequence corresponding to the N-terminal portion of the methylase which terminates within a methylase conserved motif selected from motifs X, I, II, III, IV, V, VI, VII and VIII.

[0010] In an additional embodiment, a method described above further includes a sequence corresponding to the C-terminal portion of the methylase which terminates in a methylase conserved motif selected from motifs X, I, II, III, IV, V, VI, VII and VIII. In preferred embodiments, the N-terminal portion and the C-terminal portion of the methylase are non-overlapping.

[0011] In a particular example of the above, the sequence corresponding to the N-terminal portion of the methylase motif terminates between the amino acid sequence encoding motif III and amino acids NPPY in motif IV.

[0012] In one embodiment, ligation occurs by means of a linker sequence attached to the N-terminal portion of the methylase domain and the C-terminal portion of the methylase domain on the first and second DNA segment.

[0013] In an embodiment, the fusion DNA encodes an active methylase domain.

[0014] In an embodiment, the first and second Type IIG endonucleases are endonucleases with defined cleavage and recognition sites or alternatively, the first Type IIG endonuclease is an endonuclease with defined cleavage and recognition sites and the second Type IIG endonuclease is characterized by a bioinformatic search of a microbial sequence database.

[0015] In one embodiment of the invention, a method is provided for forming a non-natural, functional Type IIG restriction endonuclease, wherein the Type IIG restriction endonuclease is characterized by a functional cleavage domain, a functional methylase domain and an altered functional specificity domain compared with a natural form of the functional Type IIG endonuclease. The method includes (a) inserting into a DNA encoding the methylase domain or the specificity domain of the natural form of the functional Type IIG endonuclease, a mutation or a nucleic acid linker sequence for inactivating optionally the cleavage domain; and inactivating (i) the functional methylase domain and the specificity domain or (ii) the functional methylase domain or the functional specificity domain; (b) ligating to the DNA at the mutation or at the linker, a DNA encoding (i) a portion of the methylase and specificity domain or (ii) a portion of the methylase or specificity domain to form a fusion DNA; and (c) transforming a host cell having a marker for detecting expression of a colony expressing a non-natural functional Type IIG restriction endonuclease.

[0016] For example, in the above method, the mutation may be positioned within a conserved motif in the methylase domain or the mutation may be a deletion at a 5′-end of the DNA encoding the specificity domain or the mutation is a deletion within the specificity domain. Alternatively, where a linker is utilized, the linker may be a transposon mediated linker insertion sequence. The linker may contain a restriction endonuclease cleavage site which is unique within the DNA encoding the restriction endonuclease.

BRIEF DESCRIPTION OF THE DRAWINGS

[0017] FIG. 1 shows the gene organization of the BpmI restriction-modification system. Genes BpmIRM and BpmIM1 code for BpmI endonuclease (BpmI endonuclease-methylase fusion protein) and BpmI M1, respectively. BpmI-Δ#1, BpmI-Δ#2, and BpmI-Δ#3 are deletion mutants with deletions in the methylation or specificity domains.

[0018] FIG. 2 shows the DNA sequence of the BpmIM1 methylase gene (BpmIM1) (SEQ ID NO:1) and its encoded amino acid sequence (SEQ ID NO:2).

[0019] FIG. 3 shows the DNA sequence of the BpmI endonuclease gene (BpmIRM) (SEQ ID NO:3) and its encoded amino acid sequence (SEQ ID NO:4).

[0020] FIG. 4 shows recombinant BpmI endonuclease activity in column fractions following heparin Sepharose chromatography (Amersham Biosciences, Piscataway, N.J.). Lane 1: purified native BpmI endonuclease; lanes 2 to 23: heparin Sepharose column fractions. Fractions 11 to 14 gave rise to complete BpmI digestion of λ DNA. The remaining fractions contain no or partial BpmI activity. Lane 24: 1 kb DNA size marker.

[0021] FIG. 5 shows the functional domain organization of the BpmI R-M fusion protein. The conserved amino acid motifs I and IV are indicated. The linear order of the conserved aa motifs in the γ-type methylase is motifs X-I-II-III-IV-V-VI-VII-VIII. The specificity domain is shared between the endonuclease and methylase activities. However, some amino acid residues may play roles in both specific binding (specificity determinant) and catalysis.

[0022] FIG. 6 shows cleavage-deficient and methylation-proficient BpmI variants. Lane 1, protein size marker; lane 2, uncut plasmid DNA (pET28-BpmIRM); lanes 3-20, BpmI variants. Lanes 3, 5-9, 12-14, 17 and 20, R⁻M⁺ variants. Note that the R-M fusion methylase (M1) conferred only partial resistance to the plasmid. Full resistance requires a second methylase, M.BpmI (M2).

[0023] FIG. 7 shows AcuI deletion variant D80A/D(520-1000) in SDS-PAGE gel in the presence and absence of IPTG. M, protein size marker (in kDa); lanes 1, 3, 5, 7, 9, 11, IPTG-induced cell extracts; lanes 2, 4, 6, 8, 10, 12, non-induced cell extracts. The predicted molecular weight of the deletion variant is 59 kDa.

[0024] FIG. 8 shows an in vivo SOS induction assay (endo-blue assay) for the BpmI/BsgI chimeric enzyme on X-gal plates. Blue colony transformants were re-streaked on X-gal plates to test the stability of the blue phenotype. First line streak, “\” corresponds to low activity BstYI mutants (blue colonies, positive control); second line streak, “-” corresponds to white colonies, vector control; third, fourth, and fifth streak lines, blue colonies of BpmI/BsgI chimeric clones. Sixteen clones remained blue while eight clones turned white or partially blue.

DETAILED DESCRIPTION OF THE EMBODIMENTS

[0025] The methods described herein address the problem of how to modify the specificity of any restriction endonuclease by genetic manipulation. A chimeric endonuclease is formed in which the specificity domain of a particular Type IIG restriction endonuclease is altered by substitution of part or all of the specificity and/or methylase domains with a complementary portion of the specificity and methylase domains from a second Type IIG restriction endonuclease or a methylase. The complementary portion may be selected from any Type IIG restriction endonuclease or a methylase identified in REBASE or identified by bioinformatic techniques described in U.S. Pat. No. 6,383,770.

[0026] The molecular architecture of Type IIG enzymes is composed of three functional domains, the catalytic, methylase, and specificity domains (R⁺-M⁺-S⁺) (FIG. 1). These domains are generally aligned from the N-terminal end to the C-terminal end in the order of catalytic domain, methylase domain and specificity domain. Whereas the methylase domain contains highly conserved regions (nine motifs have been identified), the catalytic domain and the specificity domain are generally highly variable between different Type IIG restriction endonucleases. Examples of Type IIG enzymes with established recognition and cleavage sites include BpmI, AcuI and BsgI.

[0027] Particular examples are provided below to demonstrate how functional chimeric endonucleases can be formed having altered specificity. The methods described are not intended to be limited to these examples but rather are applicable to any Type IIG restriction endonuclease characterized by a single specificity domain for both a methylase and catalytic domain. Accordingly, alteration of the specificity domain or switching specificity domains between different Type IIG restriction endonucleases or between a Type IIG endonuclease and a methylase results in alteration of the specificity domain of both the methylase and the restriction endonuclease domain in the target Type IIG endonuclease. An in vivo assay to determine whether a functionally active restriction endonuclease with altered specificity has been successfully produced is described in Example 8 using an E. coli strain carrying the dinD::lacZ fusion (see U.S. Pat. No. 5,498,535, herein incorporated by reference). This assay permits 10,000 colonies to be screened on one plate with the positive colonies appearing blue and the negative colonies, white (FIG. 8). This in vivo assay avoids time-consuming analysis of individual transformed colonies.

[0028] The alteration of specificity of a Type IIG restriction endonuclease described herein results from either of two distinct approaches.

Linker Insertion

[0029] In one approach to the above problem, a nucleic acid linker is inserted into DNA encoding the methylase or specificity domain of one Type IIG restriction endonuclease. The linker may be sufficient in length to encode 3-12 amino acids. A DNA encoding a complementary portion of a second Type IIG restriction endonuclease or a portion of a complementary region of a second Type IIG restriction endonuclease or all or part of an independent γ-type methylase (not derived from a Type IIG endonuclease) but containing a specificity region is ligated to the linker. The chimeric DNA encodes a functional restriction endonuclease with altered specificity. Example 3 describes how a DNA linker coding for up to about 10 amino acids may be inserted between the coding region for a methylase and the coding region for the restriction endonuclease (in this example, BpmI) such that a second methylase region and a specificity region are added to the linker.

[0030] It will be clear to one of ordinary skill in the art that any restriction endonuclease cleavage site that occurs only once in the DNA encoding the Type IIG endonuclease may be used as an insertion site for a linker and consequently for adding all or part of a complementary methylase/specificity domain. Examples of single cut sites in the BpmIRM gene are AfeI, BspHI, BclI, HindIII, and PacI. The linker insertion may inactivate the endonuclease catalytic activity (R⁻M⁺) or the methylase activity (R⁺M⁻) or both activities (R⁻M⁻). However, subsequent to ligation, the in vivo assay described in Example 8 can rapidly distinguish active from inactive transformed colonies.
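The search for single-cut sites described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used by the applicants: the function names and the toy sequence are hypothetical, and the recognition sequences are the standard palindromic sites of the five enzymes named in the text. Because each site is palindromic, scanning one strand is sufficient.

```python
# Recognition sequences (5'->3') for the enzymes named in the text.
# All five sites are palindromic, so one strand is enough to count them.
RECOGNITION_SITES = {
    "AfeI": "AGCGCT",
    "BspHI": "TCATGA",
    "BclI": "TGATCA",
    "HindIII": "AAGCTT",
    "PacI": "TTAATTAA",
}


def count_sites(seq: str, site: str) -> int:
    """Count occurrences (overlapping allowed) of `site` in `seq`."""
    seq, site = seq.upper(), site.upper()
    width = len(site)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] == site)


def single_cutters(seq: str, sites=RECOGNITION_SITES) -> list:
    """Return the sorted names of enzymes whose site occurs exactly once."""
    return sorted(name for name, site in sites.items() if count_sites(seq, site) == 1)


# Hypothetical toy sequence for illustration only; it is NOT the BpmIRM gene.
toy = "ATGAAGCTTCCCTGATCAGGGAAGCTTTTAATTAATAA"
print(single_cutters(toy))
```

In the toy sequence HindIII cuts twice and is therefore excluded, while the sites that occur once are reported as candidate linker insertion points.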

[0031] One approach to facilitate linker insertion uses a drug resistance cassette flanked by convenient restriction sites. Following the introduction of the drug resistance cassette (for example, selection of Km^(R) colonies), the majority of the cassette is removed by restriction digestion and religation, leaving only an in-frame insertion of 3 to 12 codons in the gene.

[0032] As an alternative to a random single cutting site or a specific cutting site for insertion of a linker, linkers may be randomly inserted using a transposon-mediated linker insertion system (GPS™-LS linker scanning system, New England Biolabs, Inc., Beverly, Mass.). This system generates insertion of, for example, 15 bp “linkers” (5 amino acids) at random positions throughout the R-M-S gene. The linker scanning mutagenesis is carried out by introduction of a transposon carrying a drug resistance marker. Following transposon insertion into the target gene carried on a plasmid and drug resistance selection, the majority of the transposon is removed by restriction digestion and ligation. Religation results in a 15 bp insertion (5 amino acids). Protein segments that are tolerant to linker insertion can be identified this way and a DNA segment encoding a novel binding specificity can be inserted afterwards in a manner similar to that described above. Again, the in vivo assay described in Example 8 is a rapid method for screening thousands of colonies for endonuclease activity. Positive colonies include colonies having R⁺M⁻S⁺ or R⁻M*S⁺ DNA (M* indicates undermethylation).
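The arithmetic behind an in-frame linker (a 15 bp insertion adding 5 amino acids and leaving the downstream reading frame intact) can be illustrated with the following sketch. The function names, the coding sequence and the linker sequence are hypothetical placeholders; the linker shown is not the sequence left behind by the GPS-LS system.

```python
def insert_linker(cds: str, position: int, linker: str) -> str:
    """Insert `linker` into `cds` before the 0-based nucleotide `position`.

    The linker length must be a multiple of 3 so that the codons
    downstream of the insertion stay in the original reading frame.
    """
    if not 0 <= position <= len(cds):
        raise ValueError("position outside the coding sequence")
    if len(linker) % 3 != 0:
        raise ValueError("linker length must be a multiple of 3 to stay in frame")
    return cds[:position] + linker + cds[position:]


def residues_added(linker: str) -> int:
    """Number of amino acids encoded by an in-frame linker."""
    return len(linker) // 3


# Hypothetical sequences for illustration only.
cds = "ATGGCTAAAGGTTAA"      # 5 codons including the stop codon
linker = "GGATCCGGTGGATCC"   # 15 bp, i.e. 5 codons
mutant = insert_linker(cds, 6, linker)
print(len(mutant) - len(cds), residues_added(linker))
```

A linker whose length is not a multiple of three is rejected, which mirrors the requirement in the text that only in-frame insertions are left in the gene.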

Non-linker Dependent Techniques

[0033] In an approach exemplified in Examples 5-8, mutations in one of the conserved motifs of the methylase domain of the target Type IIG restriction endonuclease are created which act as a site for exchange of DNA encoding the specificity domains and a part or all of the complementary portion of the methylase. These mutations can be introduced by PCR mutagenesis described in Example 6 but mutations may also be introduced according to other methods known to one of ordinary skill in the art. For example, U.S. patent application Ser. No. 10/208,557, herein incorporated by reference, describes an alternate method for creating mutations at target sites in a DNA. In Example 6, we show the construction of an AcuI deletion variant carrying a deletion in the methylase domain and specificity domain for use in formation of chimeric proteins by exchange of domains with a second Type IIG restriction endonuclease or γ-type methylase.

[0034] Example 7 describes how naturally occurring mutations can be utilized to generate a restriction endonuclease with altered specificity. In this example, ThaIVp is used for forming a chimeric enzyme with altered specificity. In Example 8, a chimera between BpmI and BsgI is described.

[0035] An alternative approach to the above non-linker techniques is to construct a restriction endonuclease variant library containing nested C-terminal amino acid deletions. The deletion can be carried out by, for example, nuclease digestion or Bal31 nuclease digestion from the C-terminal coding end. After nuclease treatment, the ends are filled-in by Klenow fragment and then ligated to a new DNA specificity (binding) domain. The library DNA carrying new DNA specificity domain is then screened by DNA binding assays or by in vivo SOS-induction assay, or other functional assays.

[0036] In any of the above approaches, whether utilizing linkers or relying on mutations, ligation of DNA segments from different sources can be achieved using any of the established techniques in the art such as ligase mediated ligation or ligation using single stranded ends (see U.S. Pat. No. 6,660,475, US 2003-0194736 A1). In addition, ligation of proteins or peptides may be achieved using the intein mediated techniques described in U.S. Pat. No. 5,496,714, WO 00/18881 and WO 00/47751.

[0037] If the resulting chimeric protein that is produced from cells transformed with ligated chimeric DNA is insoluble due to aberrant folding and forms inclusion bodies, the protein can be solubilized using various denaturing agents and refolded by slow dialysis into suitable buffer conditions.

[0038] All references cited herein are incorporated by reference.

EXAMPLES

Example 1 Techniques

[0039] PCR, site-directed mutagenesis PCR procedure: PCR conditions were 94° C. for 30 sec, 55° C. for 30 sec, 72° C. for 30 sec to 3 min for 13-25 cycles with 2 to 4 units of Vent® DNA polymerase in the presence of 2-10 mM MgCl₂, dNTP, DNA template, and 1× Thermopol buffer. In some cases, the PCR products were purified from a low-melting agarose gel and treated with β-agarase. The DNA was precipitated with ethanol and salt. After vacuum drying, the DNA was resuspended in TE buffer and used as template for assembly PCR or for restriction digestion.

[0040] Plasmid DNA preparation procedure: Qiagen spin columns were used to prepare plasmid DNA. Cell lysis and denaturation of protein and cellular DNA were performed with the addition of P1, P2, and N3 buffers. Clarified supernatant containing plasmid DNA was loaded onto Qiagen spin columns and washed with PB and PE buffers. Plasmid DNA was eluted with 10 mM Tris-HCl buffer.

[0041] Transformation procedure: Chemically competent cells were prepared by treatment of exponential phase E. coli cells with ice-cold 50 mM CaCl₂ for 30 min. Competent cells were mixed with plasmid DNA and incubated on ice for 30 min. After 3-5 min heat treatment at 37° C., an equal volume of LB broth was added. Cells were re-grown in a 37° C. incubator for one hour. Transformants were plated on LB agar plates with appropriate antibiotics for plasmid selection. X-gal was included in the plates for incubation of the “endo-blue” indicator strain.

[0042] Electroporation procedure: Electro-competent cells were prepared by washing E. coli exponential phase cells in 10% ice-cold glycerol twice (500 ml 10% glycerol for cell pellet from 1 L cell culture). After mixing the DNA with 100 μl of competent cells, electroporation was carried out under the condition of 1900 V, 200 Ω, 25 μF, 0.1 cm cuvette. One ml of LB was added to cells and incubated for 1 hour to amplify the transformants. Transformants were plated on LB agar plates with appropriate antibiotics (Ap, Cm, or Km) for plasmid selection.

[0043] Preparation of cell extracts: Cells were cultured overnight in a 37° C. shaker, pelleted by low speed centrifugation (1800 g). Cells were resuspended in a sonication buffer (50 mM Tris-HCl, pH 7.8, 10 mM β-mercaptoethanol, 50 mM NaCl). Cell lysis was completed with sonication at output 4, 50%, discontinuous burst 5 times with a small sonication tip. The lysate was clarified by centrifugation at 14000 g at 4° C. for 10 min. The supernatant (cell extract) was used for the nicking enzyme assay.

[0044] DNA sequencing: An AmpliTaq™ dideoxy terminator sequencing kit (Applied Biosystems, Foster City, Calif.) was used in the sequencing reactions. DNA sequences, resolved on an automated sequencer ABI373A, were edited and analyzed using the Seqed program (Applied Biosystems, Foster City, Calif.) and a sequence analysis software package (Accelrys Inc., San Diego, Calif.).

Example 2 Cloning of BpmI Restriction-modification System in E. coli

[0045] 1. Preparation of genomic DNA and restriction digestion of genomic DNA.

[0046] Genomic DNA was prepared from Bacillus pumilus (New England Biolabs, Inc. collection #711, Beverly, Mass.) by the standard procedure consisting of the following steps:

[0047] (a) cell lysis by addition of lysozyme (2 mg/ml final), sucrose (1% final), and 50 mM Tris-HCl, pH 8.0;

[0048] (b) cell lysis by addition of 10% SDS (final concentration 0.1%);

[0049] (c) cell lysis by addition of 1% Triton X-100 and 62 mM EDTA, 50 mM Tris-HCl, pH 8.0;

[0050] (d) phenol-CHCl₃ extraction of DNA 3 times (equal volume) and CHCl₃ extraction one time;

[0051] (e) DNA dialysis in 4 liters of TE buffer, change 3×; and

[0052] (f) removal of RNA by RNase A treatment, followed by precipitation of the genomic DNA in ethanol and resuspension in TE buffer.

[0053] Five μg genomic DNA was digested partially with 2, 1, 0.5, and 0.25 units of ApoI (recognition sequence R/AATTY) at 50° C. for 30 min. Genomic DNA fragments in the range of 2-10 kb were purified through a 1% low-melting agarose gel. Genomic and pBR322 DNA were also digested with AatII, BamHI, ClaI, EagI, EcoRI, HindIII, NdeI, NheI, SalI, and SphI, respectively. Genomic DNA fragments were ligated to pBR322 with compatible ends.

[0054] 2. Construction of ApoI partial genomic DNA library and challenge of library with BpmI.

[0055] The ApoI partial DNA fragments were ligated to EcoRI digested and calf intestinal phosphatase (CIP) (New England Biolabs, Inc., Beverly, Mass.) treated pBR322 vector. The ligated DNA was dialyzed by drop dialysis on 4 L of distilled water and transferred into E. coli RR1 competent cells by electroporation. Ap^(R) transformants were pooled and amplified. Plasmid DNA was prepared from the overnight cells and challenged with BpmI. Following BpmI digestion, the challenged DNA was transformed into RR1 cells. Ap^(R) survivors were screened for resistance to BpmI digestion. A total of 36 plasmid mini-preparations were made. Two clones, #18 and #26, were found to be resistant to BpmI digestion. AatII, BamHI, ClaI, EagI, EcoRI, HindIII, NdeI, NheI, SalI, and SphI digested genomic DNA were also ligated to pBR322 with compatible ends and genomic DNA libraries were constructed. However, no apparent BpmI resistant clones were discovered from these libraries after screening more than 144 clones.

[0056] 3. Subcloning and DNA sequencing of the resistant clone.

[0057] One resistant clone, #26, contains an insert of about 3.1 kb. The forward and reverse primers of pUC19 were used to sequence the insert. Three ApoI fragments and one HindIII fragment were gel-purified, subcloned in pUC19 and sequenced. The rest of the insert was sequenced by primer walking. A methylase gene with high homology to amino-methyltransferase (N6-adenine methylase) was found within the insert and was named the BpmIM1 gene. The BpmIM1 gene is 1,650 bp, encoding a 549-amino acid protein with predicted molecular mass of 63,702 daltons.

[0058] 4. Cloning of BpmI restriction endonuclease gene (BpmIRM) by inverse PCR.

[0059] There is one partial open reading frame upstream of the BpmIM1 gene that has 31% amino acid sequence identity to another restriction enzyme, Eco57I, with a similar recognition sequence (Eco57I recognition sequence: 5′CTGAAG N16/N14 (Janulaitis et al. Nucl. Acids Res. 20:6051-6056 (1992)); BpmI recognition sequence: 5′CTGGAG N16/N14 (see REBASE)). Genomic DNA was digested with restriction enzymes AseI, BclI, HaeII, HpaII, MboI, MseI, NlaIII, PacI, and Tsp509I. The digested DNA was ligated at a low DNA concentration at 2 μg/ml and then used for inverse PCR amplification of BpmIR gene. The sequences of the inverse PCR primers were the following:
5′ gtggaaacggaccgtattatggtt 3′ (SEQ ID NO: 5) (232-34)
5′ caccagtaaataacaggttattcc 3′ (SEQ ID NO: 6) (232-35)
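The notation 5′CTGGAG N16/N14 used for the recognition sequences above can be made concrete with a short sketch. It assumes the usual reading of this notation: the top strand is cut 16 nucleotides 3′ of the recognition sequence and the bottom strand 14 nucleotides 3′ of it (in top-strand coordinates), which leaves a 2-nucleotide 3′ overhang. The function name and the toy substrate are hypothetical, and only the top strand is scanned.

```python
def cut_positions(seq: str, site: str = "CTGGAG",
                  top_offset: int = 16, bottom_offset: int = 14) -> list:
    """Locate `site` on the top strand and return cut coordinates.

    Each result is (site_start, top_cut, bottom_cut), where the cut
    values are 0-based indices of the first nucleotide after the cut,
    expressed in top-strand coordinates. Sites too close to the end of
    the molecule to be cut are skipped. Only the given strand is scanned.
    """
    seq = seq.upper()
    results = []
    start = seq.find(site)
    while start != -1:
        end = start + len(site)
        if end + top_offset <= len(seq):
            results.append((start, end + top_offset, end + bottom_offset))
        start = seq.find(site, start + 1)
    return results


# Hypothetical toy substrate: 2 nt, the BpmI site, then 20 nt of filler.
substrate = "AA" + "CTGGAG" + "T" * 20
print(cut_positions(substrate))
```

The difference between the two cut coordinates gives the overhang length, and passing `site="CTGAAG"` applies the same logic to the Eco57I sequence quoted in the text.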

[0060] Inverse PCR conditions were 94° C. 1 min, 55° C. 1 min, 72° C. 2 min for 35 cycles. Inverse PCR products were derived from HaeIII and NlaIII templates, gel-purified from low-melting agarose and sequenced using primers 232-34 and 35.

[0061] The primers for the second round of inverse PCR were the following:
5′ ttcgtagcaagtacggtccatatcagt 3′ (SEQ ID NO: 7) (233-76)
5′ ccgtatgtacttgataggaataacctg 3′ (SEQ ID NO: 8) (233-77)

[0062] Genomic DNA was digested with AseI, BclI, BsrFI, BstNI, EcoRI, HincII, HindIII, HpaII, NcoI, PacI, PvuI, TaqI, TfiI, and XbaI. The digested DNA was ligated at a low DNA concentration at 2 μg/ml and then used for inverse PCR amplification of BpmIR gene. Inverse PCR conditions were 94° C. 1 min, 55° C. 1 min, 72° C. 2 min for 35 cycles. Inverse PCR products were derived from AseI, HindIII, HpaII, and TaqI templates, gel-purified from low-melting agarose and sequenced using primers 233-76 and 77.

[0063] The primers for the third round of inverse PCR were the following:
5′ aggaactaagaaagttcatagctg 3′ (SEQ ID NO: 9) (234-61)
5′ atgcggtattatataacccaacag 3′ (SEQ ID NO: 10) (234-62)

[0064] Genomic DNA was digested with AflIII, BspHI, BstNI, EcoRI, HaeII, HinP1I, HhaII, HindIII, StyI, and XmnI. The digested DNA was ligated at a low DNA concentration at 2 μg/ml and then used for inverse PCR amplification of BpmIR gene. Inverse PCR conditions were 94° C. 1 min, 55° C. 1 min, 72° C. 2 min for 35 cycles. Inverse PCR products were derived from HinP1I and XmnI templates, gel-purified from low-melting agarose and sequenced using primers 234-61 and 62.

[0065] The primers for the fourth round of inverse PCR were the following:
5′ tgacgtcctcttcacctaattcgg 3′ (SEQ ID NO: 11) (235-50)
5′ gagtttgtgaagatagaaccattg 3′ (SEQ ID NO: 12) (235-51)

[0066] Genomic DNA was digested with ApoI, BstBI, BstYI, ClaI, EcoRI, NdeI, RsaI, Sau3AI, SspI, TaqI, and XmnI. The digested DNA was ligated at a low DNA concentration at 2 μg/ml and then used for inverse PCR amplification of BpmIR gene. Inverse PCR conditions were 94° C. 1 min, 55° C. 1 min, 72° C. 2 min for 35 cycles. Inverse PCR products were derived from ApoI, ClaI, NdeI, RsaI, SspI, and TaqI templates, gel-purified from low-melting agarose and sequenced using primers 235-50 and 51. The ClaI fragment (2.4 kb) further extends upstream of BpmIRM gene. The rest of the ClaI fragment was sequenced using primer walking.

[0067] After four rounds of inverse PCR reactions, an open reading frame of 3,030 bp was found upstream of BpmI M1 methylase gene, which encodes a 1,009-amino acid protein with predicted molecular mass of 116,891 daltons. This is one of the largest restriction enzymes discovered so far. By amino acid sequence comparison of BpmI endonuclease with all known proteins in GenBank protein database, it was discovered that BpmI endonuclease is a fusion of two distinct elements with possible structural domains of restriction-methylation-specificity (R-M-S). This domain organization is analogous to the type I restriction-modification system with three distinct subunits, restriction, methylation, and specificity (R, M, and S). Because BpmI is distinct from other Type IIs restriction enzymes, it is proposed that BpmI belongs to a subgroup of Type II restriction enzymes called Type IIG.
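The relationship between the open reading frame lengths and the protein lengths quoted in this Example can be checked with a small calculation. The sketch below assumes that the stated gene lengths include the stop codon; the function names are hypothetical.

```python
def protein_length(orf_bp: int) -> int:
    """Amino acids encoded by an ORF whose length includes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1  # the final codon is the stop codon


def mean_residue_mass(mass_da: float, residues: int) -> float:
    """Average mass per residue, a quick sanity check on a predicted mass."""
    return mass_da / residues


# Values quoted in the text for BpmIRM and BpmIM1.
for name, bp, mass in (("BpmIRM", 3030, 116891), ("BpmIM1", 1650, 63702)):
    n = protein_length(bp)
    print(name, n, round(mean_residue_mass(mass, n), 1))
```

Both genes give the stated protein lengths, and the average residue mass falls in the range expected for a typical protein, so the reported lengths and masses are mutually consistent.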

[0068] 5. Expression of BpmIM1 gene in E. coli.

[0069] Two primers were synthesized to amplify the BpmIM1 gene in PCR. The primer sequences were:
forward: 5′ agcggatccggaggtaaataaatgaatcaattaattgaaaatgttaat 3′ (SEQ ID NO: 13) (238-177)
reverse: 5′ aagggggcatgcttatacttatttcttcgttctattgtttct 3′ (SEQ ID NO: 14) (238-178)

[0070] Following digestion with BamHI and SphI, the PCR product was ligated into pACYC184 with the compatible ends. The ligated DNA was transformed into ER2566 competent cells. Cm^(R) transformants were plated at 37° C. overnight. Plasmids with BpmIM1 gene inserts were tested for resistance to BpmI digestion. Two out of 18 clones showed full resistance to BpmI digestion, indicating efficient BpmI M1 expression in E. coli cells and BpmI site modification on the expression plasmid. The host cell ER2566 [pACYC-BpmIM1] was used for expression of BpmIRM gene.
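As a quick check that the BpmIM1 primers carry the cloning sites used in the digestion above, the following sketch searches each primer for the BamHI (GGATCC) and SphI (GCATGC) recognition sequences. The primer sequences are those of SEQ ID NO: 13 and SEQ ID NO: 14, written as continuous strings; the dictionary and function names are hypothetical.

```python
# Primer sequences from the text (SEQ ID NO: 13 and 14), written 5'->3'.
PRIMERS = {
    "238-177": "agcggatccggaggtaaataaatgaatcaattaattgaaaatgttaat",
    "238-178": "aagggggcatgcttatacttatttcttcgttctattgtttct",
}

# Recognition sequences of the two enzymes used for cloning into pACYC184.
CLONING_SITES = {"BamHI": "ggatcc", "SphI": "gcatgc"}


def sites_in(primer: str, sites=CLONING_SITES) -> list:
    """Return the names of the cloning sites present in a primer."""
    primer = primer.lower()
    return [name for name, site in sites.items() if site in primer]


for label, sequence in PRIMERS.items():
    print(label, sites_in(sequence))
```

The forward primer carries the BamHI site and the reverse primer the SphI site, consistent with directional cloning of the PCR product.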

[0071] BpmI M1 methylase also modifies the XhoI site. The XhoI recognition sequence 5′CTCGAG3′ is similar to the BpmI recognition sequence 5′CTGGAG3′, with only one base difference. It is concluded that BpmI M1 methylase may recognize the sequence 5′CTNNAG3′ and modify the adenine base to generate N6-methyladenine in the symmetric recognition sequence.
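The reasoning in this paragraph, that two sites differing at one position are both covered by the degenerate sequence 5′CTNNAG3′, can be expressed as a small matching routine. This is an illustrative sketch only: the helper names are hypothetical and the IUPAC table is limited to the codes that appear in this document (N, R and Y).

```python
import re

# Subset of the IUPAC nucleotide codes sufficient for the sites in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}


def iupac_to_regex(pattern: str) -> str:
    """Translate a degenerate recognition sequence into a regular expression."""
    return "".join(IUPAC[base] for base in pattern.upper())


def matches(pattern: str, site: str) -> bool:
    """True if `site` is one of the sequences described by `pattern`."""
    return re.fullmatch(iupac_to_regex(pattern), site.upper()) is not None


def mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sites differ."""
    if len(a) != len(b):
        raise ValueError("sites must have the same length")
    return sum(1 for x, y in zip(a.upper(), b.upper()) if x != y)


print(mismatches("CTGGAG", "CTCGAG"))
print(matches("CTNNAG", "CTGGAG"), matches("CTNNAG", "CTCGAG"))
```

The same helper handles other degenerate sites in the document, for example the ApoI sequence RAATTY, which matches GAATTC.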

[0072] 6. Expression of BpmIRM gene in E. coli using a T7 expression vector.

[0073] Two primers were synthesized to amplify the BpmIRM gene. The primer sequences were:
5′ caaggatccggaggtaaataaatgcatataagtgagttagtagataaatac 3′ (SEQ ID NO: 15) (247-217)
5′ ttaggatcctcatttttcttctcctaacgccgctgt 3′ (SEQ ID NO: 16) (238-182)

[0074] The 3,030-bp BpmIRM gene was amplified in PCR using Taq DNA polymerase, digested with BamHI and ligated into BamHI-digested T7 expression vectors pAII17 and pET21a. After transformation of the ligated DNA into ER2566 [pACYC-BpmIM1], Ap^(R) Cm^(R) transformants were screened for the endonuclease gene insert. Seven out of 72 clones contained the insert with correct orientation. However, no BpmI activity was detected in cell extracts of IPTG-induced cells. This was probably due to mutations introduced during the PCR process.

[0075] To reduce the mutation frequency, Deep Vent® DNA polymerase was used in PCR reactions to amplify the 3,030-bp BpmIRM gene. The forward primer incorporated an XbaI site and its sequence is the following:
5′ caccaatctagaggaggtaaataaatgcatataagtgagttagtagataaatac 3′ (SEQ ID NO: 17) (238-181)

[0076] PCR was performed using primers 238-181, 238-182, and Deep Vent® DNA polymerase. The PCR conditions were 94° C. 5 min for one cycle; 94° C. 1 min, 55° C. 1.5 min, 72° C. 8 min for 20 cycles. The PCR product was purified through a Qiagen spin column, digested with BamHI and XbaI, and ligated to T7 expression vectors pAII17 and pET21at with compatible ends. Eighteen out of 36 clones contained the correct size insert. Ten-ml cell cultures of all 18 clones containing inserts were induced with IPTG for 3 h, and cell extracts were prepared by sonication and assayed for BpmI activity. Clone #4 displayed partial BpmI activity. Because this gene was derived by PCR cloning, the entire BpmIRM fusion gene was sequenced on both strands and it was confirmed to be wild-type sequence.

[0077] 7. Partial purification of recombinant BpmI activity.

[0078] Five hundred ml of cell culture was grown for the expression clone #4, ER2566 [pACYC-BpmIM1, pET21at-BpmIRM]. The late-log cells were induced with IPTG, and cell extract (40 ml) containing BpmI was purified through a heparin Sepharose column. Proteins were eluted with a NaCl gradient of 50 mM to 1 M. Fractions 6 to 27 contained the highest protein concentrations and were assayed for BpmI activity on λ DNA. Fractions 15 to 18 contained the highest BpmI activity (FIG. 4). The yield was estimated at 1,800 units of BpmI per gram of wet E. coli cells. The specific activity was estimated at 24,000 units per mg of protein. Proteins from fractions 15 to 18 were analyzed on an SDS-PAGE gel and protein bands were stained with Gelcode blue stain. A protein band corresponding to ˜115 kDa was detected on the protein gel, in close agreement with the predicted size of 117 kDa.
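The two estimates above can be combined in a short worked calculation. The sketch below assumes, purely for illustration, that the yield and the specific activity refer to the same partially purified preparation; under that assumption their ratio gives the mass of recovered protein per gram of wet cells.

```python
# Worked arithmetic relating the reported yield and specific activity.
# Assumption (not stated in the text): both figures describe the same
# partially purified preparation.
YIELD_UNITS_PER_G_CELLS = 1_800       # units of BpmI per gram of wet cells
SPECIFIC_ACTIVITY_UNITS_PER_MG = 24_000  # units per mg of protein


def protein_mg_per_g_cells(units_per_g: float, units_per_mg: float) -> float:
    """Recovered protein (mg) per gram of wet cells implied by the two values."""
    return units_per_g / units_per_mg


if __name__ == "__main__":
    mg = protein_mg_per_g_cells(YIELD_UNITS_PER_G_CELLS,
                                SPECIFIC_ACTIVITY_UNITS_PER_MG)
    print(f"{mg:.3f} mg of protein per gram of wet cells")
```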

[0079] The E. coli strain ER2566 [pACYC-BpmIM1, pET21at-BpmIRM] has been deposited under the terms and conditions of the Budapest Treaty with the American Type Culture Collection on Jan. 22, 2001 and received Accession No. PTA-2598.

Example 3 Deletion of the Methylase Portion of BpmI RM Fusion Protein

[0080] Two primers were synthesized to amplify the putative endonuclease domain with deletion of the methylase and specificity domains. The deletion clone thus contains only the R portion; the M and S portions were removed. The forward primer was 238-181 as described above. The reverse primer had the following sequence, with an XhoI site at the 5′ end:
5′ tgaaatctcgagttatcctgatccacaacatatatctgctat 3′ (244-95) (SEQ ID NO: 18)
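To make the design of reverse primer 244-95 concrete, the following Python sketch reverse-complements the primer and translates the result with the standard genetic code. The reading frame is assumed to start at the first base of the reverse complement, which agrees with the codons "ata gca gat ata tgt tgt gga tca gga" in SEQ ID NO: 3; the helper names are illustrative.

```python
# Show that reverse primer 244-95 places a stop codon directly upstream of
# the XhoI site when read on the coding strand.
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

PRIMER_244_95 = "tgaaatctcgagttatcctgatccacaacatatatctgctat"
XHOI = "CTCGAG"


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate from the first base, stopping after the first stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


CODING_STRAND = reverse_complement(PRIMER_244_95)

if __name__ == "__main__":
    print(CODING_STRAND)
    print("translation:", translate(CODING_STRAND))
    print("XhoI site starts at:", CODING_STRAND.find(XHOI))
```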

[0081] The deletion junction was in motif I of the γ type N6 adenine methylase. The γ type N6 adenine methylases contain conserved motifs X, I, II, III, IV, V, VI, VII, and VIII. The specificity domain (TRD) is located after motif VIII. The BpmI deletion clone (BpmI-Δ#1) still carried motif X and part of motif I. The specificity domain after motif VIII was also deleted (the remaining portion is shown in FIG. 1).

[0082] PCR was performed using primers 238-181 and 244-95 and Taq plus Vent® DNA polymerase (94° C. for 1 min, 60° C. for 1 min, and 72° C. for 1 min for 25 cycles). The PCR product was digested with XbaI and XhoI and cloned into the T7 expression vector pET21b. Sixteen out of 36 clones screened contained the correct size insert, and the cells were induced with IPTG for 3 h. Cell extract was prepared by sonication and assayed for BpmI activity on λ DNA. However, no apparent BpmI digestion pattern was detected. Only non-specific nuclease activity was detected in the cell extract, resulting in smearing of the DNA substrate. It was concluded that deletion of the methylase and specificity portions of the BpmIRM fusion protein abolished BpmI restriction activity.

[0083] To further confirm the above result, another deletion clone was constructed that deleted methylase motifs IV, V, VI, VII, VIII, and the specificity domain. This EcoRI fragment deletion mutant (BpmI-Δ#2) contains a 1,521-bp (507-amino-acid) deletion in the C-terminal half of the fusion protein. IPTG-induced cell extract of this mutant also did not display BpmI endonuclease activity.

[0084] To delete the specificity domain, also referred to as the target-recognizing domain (TRD), a HindIII fragment of 579 bp (193 amino acids) was deleted from the C-terminus of the BpmI RM fusion endonuclease (BpmI-Δ#3). IPTG-induced cell extract of the TRD deletion mutant did not show any BpmI endonuclease activity. However, the mutant protein displayed non-specific nuclease activity. It was concluded that the TRD is also required for BpmI endonuclease activity. Deletion of the TRD may abolish or reduce the DNA binding affinity and specificity of the enzyme. By swapping in other N6 methylase and specificity domains, new enzyme specificities can be created.
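The deletion sizes quoted for BpmI-Δ#2 and BpmI-Δ#3 can be cross-checked against the 1009-residue BpmI sequence (SEQ ID NO: 4). The sketch below is an arithmetic illustration only; it assumes the deletions are in frame and ignores any vector-derived residues that a real construct might gain.

```python
# Convert the reported deletion sizes from base pairs to residues and
# report how much of the 1009-residue BpmI protein would remain.
FULL_LENGTH_AA = 1009  # length of SEQ ID NO: 4

DELETIONS_BP = {
    "BpmI-delta#2 (EcoRI fragment)": 1521,
    "BpmI-delta#3 (HindIII fragment)": 579,
}


def deleted_residues(deletion_bp: int) -> int:
    """Residues removed by an in-frame deletion of the given size."""
    if deletion_bp % 3 != 0:
        raise ValueError("deletion is not a whole number of codons")
    return deletion_bp // 3


def remaining_residues(deletion_bp: int, full_length: int = FULL_LENGTH_AA) -> int:
    """Residues left after the deletion (no added vector residues assumed)."""
    return full_length - deleted_residues(deletion_bp)


if __name__ == "__main__":
    for name, bp in DELETIONS_BP.items():
        print(name, deleted_residues(bp), "aa removed,",
              remaining_residues(bp), "aa remain")
```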

Example 4 Generation of New Enzyme Specificity Using BpmI RM Fusion Protein

[0085] Since BpmI endonuclease consists of three domains (R-M-S), it is possible to plug in other methylation-specificity domains to create a new enzyme specificity. The BpmIRM fusion gene is cloned in a T7 expression vector as described in Example 1. Plasmid DNA is prepared. The γ type N6 adenine methylases contain conserved motifs X, I, II, III, IV, V, VI, VII, and VIII (Malone T. et al. J.Mol.Biol 253:618-632 (1995)). Motifs X through VIII and the TRD are deleted, and DNA linkers coding for 1, 3, 5, 7, or 10 bridging amino acids are inserted with a restriction site, preferably a blunt end restriction site, for example, the SmaI site. The length of the DNA linker is sufficient to provide steric space for the introduction of the new M-S domains. DNA coding for other γ type N6 adenine methylases containing motifs X, I, II, III, IV, V, VI, VII, VIII and the TRD is ligated in frame to the digested blunt site of the BpmI deletion clone. The ligated DNA is transformed into a non-T7 expression host. After the insert is verified, the plasmid containing the new methylation-specificity domains is transformed into a T7 expression host and induced with IPTG. Cell extract is assayed on plasmid and phage DNA and analyzed for new restriction activity.
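The linker lengths named above translate directly into nucleotide lengths, and any inserted segment must be a multiple of three nucleotides to keep the downstream M-S domains in frame. The following Python sketch illustrates that bookkeeping; the function names are hypothetical, and the only enzyme fact used is that SmaI cuts CCCGGG bluntly between CCC and GGG.

```python
# Frame bookkeeping for the bridging linkers of 1, 3, 5, 7 and 10 residues.
LINKER_LENGTHS_AA = (1, 3, 5, 7, 10)

SMAI_SITE = "CCCGGG"
SMAI_CUT_OFFSET = 3  # blunt cut between CCC and GGG


def linker_nt(n_residues: int) -> int:
    """Nucleotides needed to encode a linker of n residues."""
    return 3 * n_residues


def keeps_frame(insert_nt: int) -> bool:
    """An insert preserves the reading frame only if it is a codon multiple."""
    return insert_nt % 3 == 0


def cut_on_codon_boundary(site_start_in_frame: int) -> bool:
    """True if the SmaI blunt cut falls between two codons.

    site_start_in_frame is the 0-based offset of the SmaI site measured
    from the start of a codon in the fusion reading frame.
    """
    return (site_start_in_frame + SMAI_CUT_OFFSET) % 3 == 0


if __name__ == "__main__":
    for n in LINKER_LENGTHS_AA:
        print(n, "aa ->", linker_nt(n), "nt, in frame:", keeps_frame(linker_nt(n)))
    print("SmaI site placed on a codon boundary cuts between codons:",
          cut_on_codon_boundary(0))
```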

Example 5 Construction of a Cleavage Deficient Variant of BpmI

[0086] Modification of the specificity of BpmI, AcuI, and BsgI, which are all Type IIG restriction enzymes that display both endonuclease and methyltransferase (R-M) activity, is achieved as follows:

[0087] Optional mutation of the restriction endonuclease was carried out to increase viability of cells transformed with an enzyme having altered specificity.

[0088] A two-step PCR mutagenesis was carried out to mutate the Asp74 codon (Asp74 to Ala74) in the catalytic domain. The PCR primers have the following sequences:
PCR reaction 1:
Forward primer (P1): 5′ CACCAATCTAGAGGAGGTAAATAAATGCATATAAGTGAGTTAGTAGATAAATAG 3′ (TCTAGA, XbaI site) (SEQ ID NO: 19)
Reverse mutagenic primer (P2): 5′ GTTTATACGAAGTGTATAAGCTGGATTTTTCTTTGTCTG 3′ (SEQ ID NO: 20)
PCR reaction 2:
Forward mutagenic primer (P3): 5′ GAGACAAAGAAAAATCCAGCTTATACACTTCGTATAAAC 3′ (SEQ ID NO: 21)
Reverse primer (P4): 5′ TTAGGATCCTCATTTTTCTTCTCCTAACGCCGCTGT 3′ (GGATCC, BamHI site) (SEQ ID NO: 22)

[0089] The N-terminal 300-bp coding sequence was amplified in PCR reaction 1 with the following PCR conditions: 94° C. for 5 min, 1 cycle; 94° C. for 30 sec, 55° C. for 30 sec, 72° C. for 30 sec for 20 cycles; 72° C., 7 min for 1 cycle, 4 units of Vent® DNA polymerase. The rest of the coding sequence was amplified in PCR reaction 2 with the following PCR conditions: 94° C. for 5 min, 1 cycle; 94° C. for 30 sec, 55° C. for 30 sec, 72° C. for 1 min for 20 cycles; 72° C., 7 min for 1 cycle, 4 units of Vent® DNA polymerase. PCR products 1 and 2 were purified from a low-melting agarose gel and used as the template for PCR assembly using primers P1 and P4. The assembly PCR conditions were 94° C. for 5 min, 1 cycle; 94° C. for 30 sec, 55° C. for 30 sec, 72° C. for 3 min for 20 cycles; 72° C., 7 min for 1 cycle. The mutagenized PCR product was purified, digested with XbaI and BamHI, and cloned into the T7 expression vector pET28a. The phenotype of the resulting BpmI variant should be R⁻ (cleavage deficient) and M⁺ (methylation proficient). After 18 plasmids were screened for the PCR insert and digested with BpmI endonuclease, 11 clones were found to be resistant to BpmI digestion (data shown in FIG. 2). Endonuclease activity was not detected in any of the mutant extracts prepared from IPTG-induced cells. Because multiple rounds of PCR were performed to generate the R⁻M⁺ variant D74A, it was necessary to re-sequence the entire gene to confirm that no other mutations were introduced. Six sequencing primers were used to sequence the entire gene. R⁻M⁺ variant D74A clone #4 carried one additional amino acid change, E1007G. In a separate experiment, it was determined that the E1007G substitution was not important to BpmI endonuclease activity. The Asp74 to Ala74 substitution abolished BpmI endonuclease activity.
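The effect of mutagenic primer P3 can be illustrated by translating it alongside the corresponding wild-type stretch of SEQ ID NO: 3 (codons 68 to 80, "gag aca aag aaa aat cca gat tat aca ctt cgt ata aac"). The Python sketch below uses the standard genetic code; the helper names and the residue-numbering offset are assumptions made for the illustration.

```python
# Translate primer P3 and the matching wild-type segment of SEQ ID NO: 3
# and report the residue that differs.
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Codons 68-80 of the BpmIRM coding sequence (SEQ ID NO: 3).
WILD_TYPE = "gag aca aag aaa aat cca gat tat aca ctt cgt ata aac".replace(" ", "")
P3 = "GAGACAAAGAAAAATCCAGCTTATACACTTCGTATAAAC"
FIRST_RESIDUE = 68  # residue number encoded by the first codon above


def translate(seq: str) -> str:
    """Translate a DNA sequence from its first base, ignoring a partial codon."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def substitutions(ref_dna: str, alt_dna: str, first_residue: int) -> list:
    """List (wild-type residue, position, mutant residue) for each change."""
    ref, alt = translate(ref_dna), translate(alt_dna)
    return [(r, first_residue + i, a)
            for i, (r, a) in enumerate(zip(ref, alt)) if r != a]


if __name__ == "__main__":
    print("wild type:", translate(WILD_TYPE))
    print("primer P3:", translate(P3))
    print("changes:", substitutions(WILD_TYPE, P3, FIRST_RESIDUE))
```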

[0090] Using the same PCR mutagenesis strategy, another BpmI R⁻M⁺ variant, E88A, was constructed. It was confirmed that E88A is deficient in endonuclease activity and proficient in methylase activity. Both the D74A and E88A mutants are useful as recipients for exchange of a new specificity domain to generate novel enzymes. After a new specificity is confirmed by DNA binding assays, the mutated residue (D74A or E88A) is changed back to the wild-type residue (Asp74 or Glu88) to restore the endonuclease activity. A non-cognate methylase can be used to protect the host DNA.

Example 6 Construction of a Deletion Mutant Carrying Deletions in the Methylase Domain and Specificity Domain

[0091] The methylase domains of BpmI and AcuI belong to the γ type N6 adenine methylases. Motif IV is a conserved methylase block and has a GNPPY sequence in both the BpmI and AcuI methylase domains. This site was chosen as a fusion junction for making chimeric enzymes. An AcuI deletion mutant was constructed that deleted methylase motif IV and the remaining C-terminal coding sequence. The starting AcuI enzyme was a cleavage-deficient variant, D80A (R⁻M⁺ mutant). The codon for Phe520 was mutated to a stop codon by PCR mutagenesis to generate variant AcuI D80A/Δ(520-1000). The deletion mutant protein was expressed in E. coli ER2566 via the T7 expression vector pET28a. When the cells were induced with IPTG (3 hours of induction at 37° C.), a prominent protein band of 59 kDa was detected in an SDS-PAGE gel (data shown in FIG. 7). The deletion mutant AcuI D80A/Δ(520-1000) is soluble in E. coli cell extract and is not degraded by E. coli proteases. This deletion mutant can be used as the backbone to construct chimeric Type IIG enzymes. DNA coding for similar methylase motifs IV to VIII and an alternate specificity determinant can be ligated to this deletion mutant to construct a functional chimeric enzyme.
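As a rough plausibility check on the 59 kDa band, the size of a protein truncated by a stop codon at position 520 can be estimated from its residue count. The sketch below uses the common rule of thumb of about 110 Da per residue, which is an approximation introduced here and not a value from the text; the true mass depends on the actual amino acid composition.

```python
# Rule-of-thumb mass estimate for the AcuI D80A deletion mutant, which
# terminates at a stop codon replacing Phe520 (so 519 residues remain).
AVERAGE_RESIDUE_MASS_DA = 110.0  # approximation, not taken from the text
STOP_CODON_POSITION = 520
OBSERVED_BAND_KDA = 59.0


def truncated_length(stop_position: int) -> int:
    """Residues encoded upstream of a stop codon at the given position."""
    return stop_position - 1


def approx_mass_kda(n_residues: int) -> float:
    """Approximate protein mass in kDa from the residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    n = truncated_length(STOP_CODON_POSITION)
    print(f"{n} residues, about {approx_mass_kda(n):.1f} kDa "
          f"(observed band: {OBSERVED_BAND_KDA:.0f} kDa)")
```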

[0092] The coding sequences are ligated together by T4 DNA ligase (blunt end or sticky end ligation). In an alternative embodiment, the two coding sequences are assembled together by a two-step PCR method as described in Example 7. After the new specificity is determined by DNA binding assays, the catalytic residue Asp80 is restored. Although not always required, a non-cognate methylase is used here to protect the host cell after transformation with the fusion DNA encoding the chimeric protein.

Example 7 Expression of a Natural Deletion Mutant of ThaIVp, a Truncated Type IIG Enzyme

[0093] ThaIVp is derived from a thermophilic bacterium. The ThaIVp coding sequence was amplified from the genomic DNA of Thermoplasma acidophilum (ATCC #25905) by PCR. The PCR primers have the following sequences:
Forward: 5′ GGTGGTTCTAGAGGAGGTAAATAAATGTCTAATGAAAATTATAACATTGATTTC 3′ (TCTAGA, XbaI site) (293-282) (SEQ ID NO: 23)
Reverse: 5′ GGTGGTGAGCTCCTATTGACATAATCGATCATCAAGAAG 3′ (GAGCTC, SacI site) (293-283) (SEQ ID NO: 24)

[0094] The PCR components were as follows: primers 293-282 and 293-283 (0.8 mM), 4 units of Vent® DNA polymerase, Thermoplasma acidophilum (ATCC #25905) genomic DNA (1 mg), dNTP (4 mM), 1x thermophilic polymerase buffer, H₂O (73 μl), and MgSO₄ at 2, 4, and 6 mM. The PCR conditions were 95° C. for 5 min for 1 cycle, followed by 95° C. for 30 sec, 57° C. for 30 sec, and 72° C. for 3 min for 30 cycles. The PCR product was digested with XbaI and SacI, ligated to pET-28a with compatible ends, and transferred to T7 expression host ER2566 by transformation. There is a natural stop codon at the end of ThaIVp before the conserved methylase motif IV. Therefore, the natural deletion mutant of ThaIVp is used as the recipient backbone for generating chimeric Type IIG enzymes by addition (ligation) of motif IV, the remaining methylase motifs, and the specificity domain. This DNA segment can be ligated to DNA encoding thermophilic methylase domains (IV to X) and a specificity domain to produce a thermostable chimeric enzyme.
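The cycling conditions given above can be written down as a small data structure, which also makes the programmed hold time easy to total. The Python sketch below is illustrative only: the class and field names are invented for the example, and ramp times between temperatures are ignored.

```python
# Represent the ThaIVp PCR program and total its programmed hold time.
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Stage:
    """A block of steps repeated a fixed number of times."""
    cycles: int
    steps: Tuple[Tuple[int, int], ...]  # (temperature in deg C, seconds)

    def seconds(self) -> int:
        return self.cycles * sum(duration for _, duration in self.steps)


THAIVP_PROGRAM: List[Stage] = [
    Stage(cycles=1, steps=((95, 300),)),                       # initial denaturation
    Stage(cycles=30, steps=((95, 30), (57, 30), (72, 180))),   # amplification
]


def total_minutes(program: List[Stage]) -> float:
    """Total programmed hold time in minutes (ramping not included)."""
    return sum(stage.seconds() for stage in program) / 60


if __name__ == "__main__":
    print(f"programmed time: {total_minutes(THAIVP_PROGRAM):.0f} min")
```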

Example 8 Construction of a Chimeric Enzyme Between BpmI and BsgI

[0095] The DNA recognition sequences for BpmI and BsgI are CTGGAG and GTGCAG, respectively. BsgI endonuclease is a Type IIG enzyme that shares 35.4% amino acid sequence identity with BpmI. A chimeric enzyme was constructed between BpmI and BsgI, in which the N-terminal coding sequence (catalytic domain plus methylase motifs I to III) was derived from BpmI and the C-terminal coding sequence (methylase motifs IV to X and the specificity domain) was derived from BsgI. The chimeric coding sequence was generated by a two-step PCR reaction. PCR primers were designed that can anneal to methylase motif IV on both the BpmI and BsgI templates. The amino acid sequences at the fusion junction are shown below:
BpmI F D A I I G N P P Y
BsgI F D V I L G N P P Y
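The two ten-residue junction sequences can be compared position by position. The following Python sketch is a simple illustration with invented helper names; it lists the positions at which the BpmI and BsgI junctions differ and counts the identical positions.

```python
# Compare the BpmI and BsgI amino acid sequences at the motif IV junction.
BPMI_JUNCTION = "FDAIIGNPPY"
BSGI_JUNCTION = "FDVILGNPPY"


def differences(a: str, b: str) -> list:
    """Return (1-based position, residue in a, residue in b) for mismatches."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


def identical_positions(a: str, b: str) -> int:
    """Number of positions with the same residue in both sequences."""
    return len(a) - len(differences(a, b))


if __name__ == "__main__":
    print("differences:", differences(BPMI_JUNCTION, BSGI_JUNCTION))
    print("identical:", identical_positions(BPMI_JUNCTION, BSGI_JUNCTION),
          "of", len(BPMI_JUNCTION))
```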

[0096] The forward primer P1 described in Example 5 and a new reverse primer P2′ were used to amplify the N-terminal coding sequence from the BpmIRM gene.

[0097] The new reverse mutagenic primer P2′ has the following sequence:
5′ ATAGGGTGGATTGCCTAATATTACATCAAAGCCACCATTTGC 3′ (P2′) (SEQ ID NO: 25)

[0098] PCR conditions were 94° C. for 5 min, 1 cycle; 94° C. for 30 sec, 55° C. for 30 sec, 72° C. for 2 min for 17-22 cycles; 72° C., 7 min for 1 cycle with 4 units of Vent® DNA polymerase.

[0099] The forward mutagenic primer in the fusion junction has the following sequence:
5′ TTTGATGTAATATTAGGCAATCCACCCTATATAAGAATTC 3′ (P3′) (SEQ ID NO: 26)

[0100] Since the BsgIRM gene was cloned in pUC19, primer P3′ and pUC universal primer NEB #1221 were used to amplify the C-terminal BsgI coding sequence. PCR conditions were 94° C. for 5 min, 1 cycle; 94° C. for 30 sec, 55° C. for 30 sec, 72° C. for 2 min for 15-22 cycles; 72° C., 7 min for 1 cycle with 4 units of Vent® DNA polymerase. The PCR products were purified from a low-melting agarose gel and assembled by PCR using primer P1 and pUC universal primer #1221 (New England Biolabs, Inc., Beverly, Mass.). The PCR conditions were 94° C. for 5 min, 1 cycle; 94° C. for 30 sec, 55° C. for 30 sec, 72° C. for 3 min 10 sec for 15 cycles; 72° C., 7 min for 1 cycle with 4 units of Vent® DNA polymerase. The PCR DNA fragment was cloned into pET21at and transformed into T7 expression host ER2566. An E. coli host carrying pACYC-BpmIM or pACYC-BsgIM was also used for transformation. The fusion junction was confirmed by DNA sequencing.
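Assembly of the two PCR products depends on the junction primers P2′ and P3′ sharing sequence. The Python sketch below, offered only as an illustration with invented helper names, reverse-complements P2′, measures its overlap with P3′, and translates both with the standard genetic code (reading from the first base) to show the residues encoded across the junction.

```python
# Check the overlap between junction primers P2' and P3' and translate them.
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

P2_PRIME = "ATAGGGTGGATTGCCTAATATTACATCAAAGCCACCATTTGC"  # reverse primer
P3_PRIME = "TTTGATGTAATATTAGGCAATCCACCCTATATAAGAATTC"    # forward primer


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate from the first base, ignoring any trailing partial codon."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def overlap_length(upstream: str, downstream: str) -> int:
    """Longest suffix of upstream that equals a prefix of downstream."""
    for k in range(min(len(upstream), len(downstream)), 0, -1):
        if upstream[-k:] == downstream[:k]:
            return k
    return 0


P2_SENSE = reverse_complement(P2_PRIME)

if __name__ == "__main__":
    print("overlap (nt):", overlap_length(P2_SENSE, P3_PRIME))
    print("P2' sense strand encodes:", translate(P2_SENSE))
    print("P3' encodes:", translate(P3_PRIME))
```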

[0101] E. coli strain ER1992 carries the dinD::lacZ fusion (the dinD DNA damage inducible promoter is fused to the lacZ gene). When bacterial DNA is damaged by double-stranded cuts or single-stranded nicks, UV radiation, or interference with DNA replication, the indicator strain forms blue colonies on X-gal plates. When plasmids carrying the chimeric BpmI/BsgI R-M fusion were transformed into the endo-blue indicator strain ER1992 (dinD::lacZ), they caused formation of blue colonies in the absence of IPTG induction. This indicates that the transformants suffered DNA damage resulting from constitutive expression of the fusion protein (data shown in FIG. 4). The transformants initially formed blue colonies on X-gal plates. When these cells were plated on X-gal plates with IPTG, most cells turned white. After IPTG induction, cells suffered a lethal level of DNA damage and died. The cells carrying an inactive mutant version of the chimeric R-M fusion protein took over the population and formed the white colonies. When the blue transformants were re-streaked on X-gal plates, about two-thirds remained blue, and one-third formed white or partially blue colonies.

1 26 1 1650 DNA Bacillus pumilus CDS (1)..(1650) 1 atg aat caa tta att gaa aat gtt aat cta caa aaa tta agg ggt ggg 48 Met Asn Gln Leu Ile Glu Asn Val Asn Leu Gln Lys Leu Arg Gly Gly 1 5 10 15 tat tac acc cct aaa gtt att gct gac ttt tta tgt caa tgg agt att 96 Tyr Tyr Thr Pro Lys Val Ile Ala Asp Phe Leu Cys Gln Trp Ser Ile 20 25 30 caa gat gac aca aag agt gta ctt gaa ccc agt tgt gga gat ggt aat 144 Gln Asp Asp Thr Lys Ser Val Leu Glu Pro Ser Cys Gly Asp Gly Asn 35 40 45 ttt att gaa tcg gca ata ctt agg ttc aaa gaa ctt agt ata gat aat 192 Phe Ile Glu Ser Ala Ile Leu Arg Phe Lys Glu Leu Ser Ile Asp Asn 50 55 60 gaa caa ctt aaa gga aga att aca gga gta gag cta att gaa gaa gaa 240 Glu Gln Leu Lys Gly Arg Ile Thr Gly Val Glu Leu Ile Glu Glu Glu 65 70 75 80 gct ttg aaa gtt caa aat cga gca aat gag ttg ggg gtt gat aaa aac 288 Ala Leu Lys Val Gln Asn Arg Ala Asn Glu Leu Gly Val Asp Lys Asn 85 90 95 tca ata gta aat agt gac ttc ttt caa ttt gta aaa gat aat aag aat 336 Ser Ile Val Asn Ser Asp Phe Phe Gln Phe Val Lys Asp Asn Lys Asn 100 105 110 aaa aaa ttt gat act att att ggt aat cca cca ttc ata aga tac caa 384 Lys Lys Phe Asp Thr Ile Ile Gly Asn Pro Pro Phe Ile Arg Tyr Gln 115 120 125 aac ttt cct gaa gag cat cgt agt ata gcc atg gaa atg atg gag gaa 432 Asn Phe Pro Glu Glu His Arg Ser Ile Ala Met Glu Met Met Glu Glu 130 135 140 cta ggt tta aaa cct aat aaa ctt aca aat atc tgg gtt cca ttt cta 480 Leu Gly Leu Lys Pro Asn Lys Leu Thr Asn Ile Trp Val Pro Phe Leu 145 150 155 160 gtg gta tct gct aca tta ctt aat gaa caa gga aag atg gct atg gtt 528 Val Val Ser Ala Thr Leu Leu Asn Glu Gln Gly Lys Met Ala Met Val 165 170 175 ata ccg gct gaa tta ttt cag gta aag tat gca gca gaa aca aga att 576 Ile Pro Ala Glu Leu Phe Gln Val Lys Tyr Ala Ala Glu Thr Arg Ile 180 185 190 ttt tta tca aag ttt ttc gat cgt atc act ata att aca ttt gaa aaa 624 Phe Leu Ser Lys Phe Phe Asp Arg Ile Thr Ile Ile Thr Phe Glu Lys 195 200 205 ctt gtt ttt gaa aat atc caa cag gaa gtt ata cta ctt ctt tgt gaa 672 Leu Val Phe Glu Asn 
Ile Gln Gln Glu Val Ile Leu Leu Leu Cys Glu 210 215 220 aag aaa gtt aat aaa ggt aaa gga att cgg gtt att gaa tgc gag aac 720 Lys Lys Val Asn Lys Gly Lys Gly Ile Arg Val Ile Glu Cys Glu Asn 225 230 235 240 tta gat gga tta aat tcc att gat ttt gta gct ata aat ggt tca aat 768 Leu Asp Gly Leu Asn Ser Ile Asp Phe Val Ala Ile Asn Gly Ser Asn 245 250 255 gtt aaa cct att gaa cac cgt act gaa aag tgg aca aag tat ttc tta 816 Val Lys Pro Ile Glu His Arg Thr Glu Lys Trp Thr Lys Tyr Phe Leu 260 265 270 aac gaa gat gaa ata ctt ctt tta cag agt tta aag gaa gac aaa cgc 864 Asn Glu Asp Glu Ile Leu Leu Leu Gln Ser Leu Lys Glu Asp Lys Arg 275 280 285 gtt aaa aat tgt aat gac tat ttt aag aca gaa gtt ggc tta gtt act 912 Val Lys Asn Cys Asn Asp Tyr Phe Lys Thr Glu Val Gly Leu Val Thr 290 295 300 gga cga aac gaa ttc ttt atg atg aaa gaa aac caa gta aaa gaa tgg 960 Gly Arg Asn Glu Phe Phe Met Met Lys Glu Asn Gln Val Lys Glu Trp 305 310 315 320 aat cta gaa gaa tat aca ata cct gtt aca ggt agg tcc aat cag tta 1008 Asn Leu Glu Glu Tyr Thr Ile Pro Val Thr Gly Arg Ser Asn Gln Leu 325 330 335 aaa ggt ata aca ttt aca gaa aat gat ttt cat gaa aat tca atg gaa 1056 Lys Gly Ile Thr Phe Thr Glu Asn Asp Phe His Glu Asn Ser Met Glu 340 345 350 caa aag gca att cac cta ttt ttg cca cca gat gaa gat ttt gaa aag 1104 Gln Lys Ala Ile His Leu Phe Leu Pro Pro Asp Glu Asp Phe Glu Lys 355 360 365 tta ccg att gag tgt caa aat tat atc aag tat ggg gaa gaa aaa ggc 1152 Leu Pro Ile Glu Cys Gln Asn Tyr Ile Lys Tyr Gly Glu Glu Lys Gly 370 375 380 ttc cat caa ggc tat aaa acc aga att aga aaa cgt tgg tat ata act 1200 Phe His Gln Gly Tyr Lys Thr Arg Ile Arg Lys Arg Trp Tyr Ile Thr 385 390 395 400 cca tct aga tgg gtt cca gat gct ttt gct tta aga cag gtt gat ggc 1248 Pro Ser Arg Trp Val Pro Asp Ala Phe Ala Leu Arg Gln Val Asp Gly 405 410 415 tat cca aaa cta att tta aat gaa acc gac gct tct tct act gat aca 1296 Tyr Pro Lys Leu Ile Leu Asn Glu Thr Asp Ala Ser Ser Thr Asp Thr 420 425 430 att cat agg gtt aga ttt aaa gaa ggt ata aat gaa aag 
tta gcc gta 1344 Ile His Arg Val Arg Phe Lys Glu Gly Ile Asn Glu Lys Leu Ala Val 435 440 445 gtt tca ttt ttg aac tca ctc act ttt gca tct tca gaa ata acg ggg 1392 Val Ser Phe Leu Asn Ser Leu Thr Phe Ala Ser Ser Glu Ile Thr Gly 450 455 460 aga agt tat ggt ggt ggt gtt atg aca ttc gaa cca act gaa att gga 1440 Arg Ser Tyr Gly Gly Gly Val Met Thr Phe Glu Pro Thr Glu Ile Gly 465 470 475 480 gaa atc cta ata cct tcc ttt gat aac tta tcc att gat ttt gat aaa 1488 Glu Ile Leu Ile Pro Ser Phe Asp Asn Leu Ser Ile Asp Phe Asp Lys 485 490 495 att gat gcc tta att cga gaa aag gag att gaa aaa gtc ctt gat att 1536 Ile Asp Ala Leu Ile Arg Glu Lys Glu Ile Glu Lys Val Leu Asp Ile 500 505 510 gtt gat gaa gct tta ctt ata aaa tat cat ggg ttt agt gag aaa gaa 1584 Val Asp Glu Ala Leu Leu Ile Lys Tyr His Gly Phe Ser Glu Lys Glu 515 520 525 gta aaa cag ctt cga ggg ata tgg aag aaa ctt tct cag aga aga aac 1632 Val Lys Gln Leu Arg Gly Ile Trp Lys Lys Leu Ser Gln Arg Arg Asn 530 535 540 aat aga acg aag aaa taa 1650 Asn Arg Thr Lys Lys 545 550 2 549 PRT Bacillus pumilus 2 Met Asn Gln Leu Ile Glu Asn Val Asn Leu Gln Lys Leu Arg Gly Gly 1 5 10 15 Tyr Tyr Thr Pro Lys Val Ile Ala Asp Phe Leu Cys Gln Trp Ser Ile 20 25 30 Gln Asp Asp Thr Lys Ser Val Leu Glu Pro Ser Cys Gly Asp Gly Asn 35 40 45 Phe Ile Glu Ser Ala Ile Leu Arg Phe Lys Glu Leu Ser Ile Asp Asn 50 55 60 Glu Gln Leu Lys Gly Arg Ile Thr Gly Val Glu Leu Ile Glu Glu Glu 65 70 75 80 Ala Leu Lys Val Gln Asn Arg Ala Asn Glu Leu Gly Val Asp Lys Asn 85 90 95 Ser Ile Val Asn Ser Asp Phe Phe Gln Phe Val Lys Asp Asn Lys Asn 100 105 110 Lys Lys Phe Asp Thr Ile Ile Gly Asn Pro Pro Phe Ile Arg Tyr Gln 115 120 125 Asn Phe Pro Glu Glu His Arg Ser Ile Ala Met Glu Met Met Glu Glu 130 135 140 Leu Gly Leu Lys Pro Asn Lys Leu Thr Asn Ile Trp Val Pro Phe Leu 145 150 155 160 Val Val Ser Ala Thr Leu Leu Asn Glu Gln Gly Lys Met Ala Met Val 165 170 175 Ile Pro Ala Glu Leu Phe Gln Val Lys Tyr Ala Ala Glu Thr Arg Ile 180 185 190 Phe Leu Ser Lys Phe Phe Asp Arg Ile Thr Ile 
Ile Thr Phe Glu Lys 195 200 205 Leu Val Phe Glu Asn Ile Gln Gln Glu Val Ile Leu Leu Leu Cys Glu 210 215 220 Lys Lys Val Asn Lys Gly Lys Gly Ile Arg Val Ile Glu Cys Glu Asn 225 230 235 240 Leu Asp Gly Leu Asn Ser Ile Asp Phe Val Ala Ile Asn Gly Ser Asn 245 250 255 Val Lys Pro Ile Glu His Arg Thr Glu Lys Trp Thr Lys Tyr Phe Leu 260 265 270 Asn Glu Asp Glu Ile Leu Leu Leu Gln Ser Leu Lys Glu Asp Lys Arg 275 280 285 Val Lys Asn Cys Asn Asp Tyr Phe Lys Thr Glu Val Gly Leu Val Thr 290 295 300 Gly Arg Asn Glu Phe Phe Met Met Lys Glu Asn Gln Val Lys Glu Trp 305 310 315 320 Asn Leu Glu Glu Tyr Thr Ile Pro Val Thr Gly Arg Ser Asn Gln Leu 325 330 335 Lys Gly Ile Thr Phe Thr Glu Asn Asp Phe His Glu Asn Ser Met Glu 340 345 350 Gln Lys Ala Ile His Leu Phe Leu Pro Pro Asp Glu Asp Phe Glu Lys 355 360 365 Leu Pro Ile Glu Cys Gln Asn Tyr Ile Lys Tyr Gly Glu Glu Lys Gly 370 375 380 Phe His Gln Gly Tyr Lys Thr Arg Ile Arg Lys Arg Trp Tyr Ile Thr 385 390 395 400 Pro Ser Arg Trp Val Pro Asp Ala Phe Ala Leu Arg Gln Val Asp Gly 405 410 415 Tyr Pro Lys Leu Ile Leu Asn Glu Thr Asp Ala Ser Ser Thr Asp Thr 420 425 430 Ile His Arg Val Arg Phe Lys Glu Gly Ile Asn Glu Lys Leu Ala Val 435 440 445 Val Ser Phe Leu Asn Ser Leu Thr Phe Ala Ser Ser Glu Ile Thr Gly 450 455 460 Arg Ser Tyr Gly Gly Gly Val Met Thr Phe Glu Pro Thr Glu Ile Gly 465 470 475 480 Glu Ile Leu Ile Pro Ser Phe Asp Asn Leu Ser Ile Asp Phe Asp Lys 485 490 495 Ile Asp Ala Leu Ile Arg Glu Lys Glu Ile Glu Lys Val Leu Asp Ile 500 505 510 Val Asp Glu Ala Leu Leu Ile Lys Tyr His Gly Phe Ser Glu Lys Glu 515 520 525 Val Lys Gln Leu Arg Gly Ile Trp Lys Lys Leu Ser Gln Arg Arg Asn 530 535 540 Asn Arg Thr Lys Lys 545 3 3030 DNA Bacillus pumilus CDS (1)..(3030) 3 atg cat ata agt gag tta gta gat aaa tac aaa gcg cat aga agt act 48 Met His Ile Ser Glu Leu Val Asp Lys Tyr Lys Ala His Arg Ser Thr 1 5 10 15 ttt tta aaa cca act tat aat gaa act caa cta agg aat gat ttt ata 96 Phe Leu Lys Pro Thr Tyr Asn Glu Thr Gln Leu Arg Asn Asp Phe Ile 20 25 30 gac 
cca ctt cta aaa tct tta gga tgg gat gtt gat aat acc aaa gga 144 Asp Pro Leu Leu Lys Ser Leu Gly Trp Asp Val Asp Asn Thr Lys Gly 35 40 45 aaa aca cat att cta aga gat gtc att caa gaa gaa tac ata gaa ata 192 Lys Thr His Ile Leu Arg Asp Val Ile Gln Glu Glu Tyr Ile Glu Ile 50 55 60 aaa gat gag gag aca aag aaa aat cca gat tat aca ctt cgt ata aac 240 Lys Asp Glu Glu Thr Lys Lys Asn Pro Asp Tyr Thr Leu Arg Ile Asn 65 70 75 80 ggt acg aga aag ctg ttt gta gag gtt aag aaa ccg tct ttt aat att 288 Gly Thr Arg Lys Leu Phe Val Glu Val Lys Lys Pro Ser Phe Asn Ile 85 90 95 ttg aaa tca gct aaa gca gcc ttc caa aca aga aga tat ggt tgg agt 336 Leu Lys Ser Ala Lys Ala Ala Phe Gln Thr Arg Arg Tyr Gly Trp Ser 100 105 110 gct aac ctt ggt att tca gta ctt aca aat ttc gag cat cta gtt att 384 Ala Asn Leu Gly Ile Ser Val Leu Thr Asn Phe Glu His Leu Val Ile 115 120 125 tat gat tgt aga tat acg cct gac aaa tcc gac aat gaa cat att gct 432 Tyr Asp Cys Arg Tyr Thr Pro Asp Lys Ser Asp Asn Glu His Ile Ala 130 135 140 aga tat aaa gtt ttc tct tac gag gaa tat gaa gaa gca ttt gat gaa 480 Arg Tyr Lys Val Phe Ser Tyr Glu Glu Tyr Glu Glu Ala Phe Asp Glu 145 150 155 160 ata aag gat ata att tca tat gag tca gcc aac tca ggt gct ctg gac 528 Ile Lys Asp Ile Ile Ser Tyr Glu Ser Ala Asn Ser Gly Ala Leu Asp 165 170 175 gaa atg ttt gat gta aat aca aga gtt ggt gaa acg ttt gac gag tat 576 Glu Met Phe Asp Val Asn Thr Arg Val Gly Glu Thr Phe Asp Glu Tyr 180 185 190 ttt tta cag caa att gag aat tgg cgc gaa aag cta gct aaa act gca 624 Phe Leu Gln Gln Ile Glu Asn Trp Arg Glu Lys Leu Ala Lys Thr Ala 195 200 205 att aaa aat aac acc gaa tta ggt gaa gag gac gtc aat ttt att gtc 672 Ile Lys Asn Asn Thr Glu Leu Gly Glu Glu Asp Val Asn Phe Ile Val 210 215 220 caa aga cta tta aac aga att att ttt ctt aga gtt tgt gaa gat aga 720 Gln Arg Leu Leu Asn Arg Ile Ile Phe Leu Arg Val Cys Glu Asp Arg 225 230 235 240 acc att gaa aaa tat gaa aca att aaa agt ata aaa aac tat gag gaa 768 Thr Ile Glu Lys Tyr Glu Thr Ile Lys Ser Ile Lys Asn Tyr Glu Glu 
245 250 255 tta aaa gat ctg ttt caa aag tct gat agg aaa ttt aat tca ggt ctc 816 Leu Lys Asp Leu Phe Gln Lys Ser Asp Arg Lys Phe Asn Ser Gly Leu 260 265 270 ttt gac ttc ata gat gat acg ctc ttg ctt gag gtt gaa att gat tcg 864 Phe Asp Phe Ile Asp Asp Thr Leu Leu Leu Glu Val Glu Ile Asp Ser 275 280 285 aat gta ttg ata gaa att ttt agt gat tta tat ttc cca caa agc cca 912 Asn Val Leu Ile Glu Ile Phe Ser Asp Leu Tyr Phe Pro Gln Ser Pro 290 295 300 tat gat ttt tct gtt gtc gat cca aca ata tta agc cag ata tat gaa 960 Tyr Asp Phe Ser Val Val Asp Pro Thr Ile Leu Ser Gln Ile Tyr Glu 305 310 315 320 cgt ttt cta ggt caa gaa ata att ata gag tca ggt ggt aca ttt cac 1008 Arg Phe Leu Gly Gln Glu Ile Ile Ile Glu Ser Gly Gly Thr Phe His 325 330 335 att acg gag tca cca gaa gtt gcg gcg tcc aat ggt gtt gtt cca act 1056 Ile Thr Glu Ser Pro Glu Val Ala Ala Ser Asn Gly Val Val Pro Thr 340 345 350 cca aaa att atc gtc gaa cag ata gtg aaa gac act tta acg ccc ctt 1104 Pro Lys Ile Ile Val Glu Gln Ile Val Lys Asp Thr Leu Thr Pro Leu 355 360 365 acg gaa ggc aaa aaa ttt aat gag cta tgt aac tta aaa ata gca gat 1152 Thr Glu Gly Lys Lys Phe Asn Glu Leu Cys Asn Leu Lys Ile Ala Asp 370 375 380 ata tgt tgt gga tca gga act ttc cta att tca agt tat gac ttt cta 1200 Ile Cys Cys Gly Ser Gly Thr Phe Leu Ile Ser Ser Tyr Asp Phe Leu 385 390 395 400 gta gag aaa gta atg gaa aag ata ata gaa gag aac atc gat gat tca 1248 Val Glu Lys Val Met Glu Lys Ile Ile Glu Glu Asn Ile Asp Asp Ser 405 410 415 gat tta gta tat gaa act gaa gaa ggg cta att ttg aca ctt aaa gca 1296 Asp Leu Val Tyr Glu Thr Glu Glu Gly Leu Ile Leu Thr Leu Lys Ala 420 425 430 aaa aga aat atc ttg gag aat aat ttg ttt ggt gtt gat gtt aat cca 1344 Lys Arg Asn Ile Leu Glu Asn Asn Leu Phe Gly Val Asp Val Asn Pro 435 440 445 tac gct gtt gaa gta gct gag ttc agt tta tta tta aag cta tta gaa 1392 Tyr Ala Val Glu Val Ala Glu Phe Ser Leu Leu Leu Lys Leu Leu Glu 450 455 460 ggt gag aat gag gca tcg gtt aat aat ttc att cac gag cat gag gat 1440 Gly Glu Asn Glu Ala Ser Val 
Asn Asn Phe Ile His Glu His Glu Asp 465 470 475 480 aaa ata tta ccg gat tta aca tct att att aaa tgt gga aac agc tta 1488 Lys Ile Leu Pro Asp Leu Thr Ser Ile Ile Lys Cys Gly Asn Ser Leu 485 490 495 gta gat aat aag ttt ttt gaa ttc atg cca gaa tcg tta gag gac gat 1536 Val Asp Asn Lys Phe Phe Glu Phe Met Pro Glu Ser Leu Glu Asp Asp 500 505 510 gaa atc tta ttt aag gct aat cca ttt gaa tgg gaa gag gag ttt cca 1584 Glu Ile Leu Phe Lys Ala Asn Pro Phe Glu Trp Glu Glu Glu Phe Pro 515 520 525 gat att atg gca aat ggt ggc ttt gat gct att ata gga aat cca cct 1632 Asp Ile Met Ala Asn Gly Gly Phe Asp Ala Ile Ile Gly Asn Pro Pro 530 535 540 tat gtt cga ata cag aac atg aaa aaa tat agt cct gag gaa att gaa 1680 Tyr Val Arg Ile Gln Asn Met Lys Lys Tyr Ser Pro Glu Glu Ile Glu 545 550 555 560 tat tat caa tca aaa gac tct gaa tat act gtt gca aaa aaa gaa aca 1728 Tyr Tyr Gln Ser Lys Asp Ser Glu Tyr Thr Val Ala Lys Lys Glu Thr 565 570 575 gtt gac aag tat ttt tta ttt att gag aga gca tta ata tta ctc aat 1776 Val Asp Lys Tyr Phe Leu Phe Ile Glu Arg Ala Leu Ile Leu Leu Asn 580 585 590 cct act ggg ctg ttg ggt tat ata ata ccg cat aaa ttc ttt att aca 1824 Pro Thr Gly Leu Leu Gly Tyr Ile Ile Pro His Lys Phe Phe Ile Thr 595 600 605 aaa ggt ggt aag gaa cta aga aag ttc ata gct gaa aaa cat caa ata 1872 Lys Gly Gly Lys Glu Leu Arg Lys Phe Ile Ala Glu Lys His Gln Ile 610 615 620 tca aaa att ata aat ttt ggt gtt aca cag gtc ttt cca gga aga gcg 1920 Ser Lys Ile Ile Asn Phe Gly Val Thr Gln Val Phe Pro Gly Arg Ala 625 630 635 640 aca tat acg gct att tta att atc caa gca aat aaa atg gca cag ttc 1968 Thr Tyr Thr Ala Ile Leu Ile Ile Gln Ala Asn Lys Met Ala Gln Phe 645 650 655 aag tat aag aaa gta agt aat ata tca gca gaa acc cta gat tct gaa 2016 Lys Tyr Lys Lys Val Ser Asn Ile Ser Ala Glu Thr Leu Asp Ser Glu 660 665 670 gaa aat acg tgt gtt tat agc tca gaa aag tat aat tct gac cct tgg 2064 Glu Asn Thr Cys Val Tyr Ser Ser Glu Lys Tyr Asn Ser Asp Pro Trp 675 680 685 ata ttt tta tct cct gaa aca gaa gct gtt ttt act aaa 
ttt aca gaa 2112 Ile Phe Leu Ser Pro Glu Thr Glu Ala Val Phe Thr Lys Phe Thr Glu 690 695 700 gct caa ttt gag aaa ctt gga gaa atc act gat ata agt gta gga cta 2160 Ala Gln Phe Glu Lys Leu Gly Glu Ile Thr Asp Ile Ser Val Gly Leu 705 710 715 720 caa aca agc gct gat aaa ata tat att ttt att cct gaa aat gaa act 2208 Gln Thr Ser Ala Asp Lys Ile Tyr Ile Phe Ile Pro Glu Asn Glu Thr 725 730 735 tca gat aca tat ata ttt aat tat aaa ggg aaa aga tat gaa ata gaa 2256 Ser Asp Thr Tyr Ile Phe Asn Tyr Lys Gly Lys Arg Tyr Glu Ile Glu 740 745 750 aaa tct ata tgt tgc cca gct atc tat gac tta tct ttt ggt tct ttt 2304 Lys Ser Ile Cys Cys Pro Ala Ile Tyr Asp Leu Ser Phe Gly Ser Phe 755 760 765 gaa agc att cag gga aat gca caa atg ata ttc cct tat gaa atc aga 2352 Glu Ser Ile Gln Gly Asn Ala Gln Met Ile Phe Pro Tyr Glu Ile Arg 770 775 780 gat gaa gaa gca tat cta cta gag gaa gaa acg ctt gaa aat gat tat 2400 Asp Glu Glu Ala Tyr Leu Leu Glu Glu Glu Thr Leu Glu Asn Asp Tyr 785 790 795 800 cct ctt gct tgg aat tat ttg aat gag ttt aaa gaa gct ctt gaa aaa 2448 Pro Leu Ala Trp Asn Tyr Leu Asn Glu Phe Lys Glu Ala Leu Glu Lys 805 810 815 aga agc tta caa ggc cgt aat ccg aaa tgg tat caa tat ggt cgg tcc 2496 Arg Ser Leu Gln Gly Arg Asn Pro Lys Trp Tyr Gln Tyr Gly Arg Ser 820 825 830 caa agt tta tca aaa ttt cat gat aaa gaa aaa ctg ata tgg acc gta 2544 Gln Ser Leu Ser Lys Phe His Asp Lys Glu Lys Leu Ile Trp Thr Val 835 840 845 ctt gct acg aaa ccc ccg tat gta ctt gat agg aat aac ctg tta ttt 2592 Leu Ala Thr Lys Pro Pro Tyr Val Leu Asp Arg Asn Asn Leu Leu Phe 850 855 860 act ggt ggt gga aac gga ccg tat tat ggt tta att aac caa tct att 2640 Thr Gly Gly Gly Asn Gly Pro Tyr Tyr Gly Leu Ile Asn Gln Ser Ile 865 870 875 880 tac tct ttg cat tat ttt tta ggt att ctt tca cat cct gta ata gaa 2688 Tyr Ser Leu His Tyr Phe Leu Gly Ile Leu Ser His Pro Val Ile Glu 885 890 895 agt atg gta aaa gca agg gcc agt gaa ttt agg gga tca tat tat tct 2736 Ser Met Val Lys Ala Arg Ala Ser Glu Phe Arg Gly Ser Tyr Tyr Ser 900 905 910 cat gga 
aaa caa ttt att gag aaa atc cca att aga aag att gat ttt 2784 His Gly Lys Gln Phe Ile Glu Lys Ile Pro Ile Arg Lys Ile Asp Phe 915 920 925 gat gat caa gat gag gta gac aaa tat aat acg gtg gtc aca aca gta 2832 Asp Asp Gln Asp Glu Val Asp Lys Tyr Asn Thr Val Val Thr Thr Val 930 935 940 gaa aaa tta att ata act acc gat aga att aaa agt gag agc aat gga 2880 Glu Lys Leu Ile Ile Thr Thr Asp Arg Ile Lys Ser Glu Ser Asn Gly 945 950 955 960 ccc cgg agg aga atg tta aga aga agg tta gat gct ttg tct aat caa 2928 Pro Arg Arg Arg Met Leu Arg Arg Arg Leu Asp Ala Leu Ser Asn Gln 965 970 975 ctt atc cag gtt att aat gaa ctt tat aat atc agt gac gaa gaa tat 2976 Leu Ile Gln Val Ile Asn Glu Leu Tyr Asn Ile Ser Asp Glu Glu Tyr 980 985 990 acg aca gtt ttg aat gat gaa atg ttg aca gcg gcg tta gga gaa gaa 3024 Thr Thr Val Leu Asn Asp Glu Met Leu Thr Ala Ala Leu Gly Glu Glu 995 1000 1005 aaa tga 3030 Lys 1010 4 1009 PRT Bacillus pumilus 4 Met His Ile Ser Glu Leu Val Asp Lys Tyr Lys Ala His Arg Ser Thr 1 5 10 15 Phe Leu Lys Pro Thr Tyr Asn Glu Thr Gln Leu Arg Asn Asp Phe Ile 20 25 30 Asp Pro Leu Leu Lys Ser Leu Gly Trp Asp Val Asp Asn Thr Lys Gly 35 40 45 Lys Thr His Ile Leu Arg Asp Val Ile Gln Glu Glu Tyr Ile Glu Ile 50 55 60 Lys Asp Glu Glu Thr Lys Lys Asn Pro Asp Tyr Thr Leu Arg Ile Asn 65 70 75 80 Gly Thr Arg Lys Leu Phe Val Glu Val Lys Lys Pro Ser Phe Asn Ile 85 90 95 Leu Lys Ser Ala Lys Ala Ala Phe Gln Thr Arg Arg Tyr Gly Trp Ser 100 105 110 Ala Asn Leu Gly Ile Ser Val Leu Thr Asn Phe Glu His Leu Val Ile 115 120 125 Tyr Asp Cys Arg Tyr Thr Pro Asp Lys Ser Asp Asn Glu His Ile Ala 130 135 140 Arg Tyr Lys Val Phe Ser Tyr Glu Glu Tyr Glu Glu Ala Phe Asp Glu 145 150 155 160 Ile Lys Asp Ile Ile Ser Tyr Glu Ser Ala Asn Ser Gly Ala Leu Asp 165 170 175 Glu Met Phe Asp Val Asn Thr Arg Val Gly Glu Thr Phe Asp Glu Tyr 180 185 190 Phe Leu Gln Gln Ile Glu Asn Trp Arg Glu Lys Leu Ala Lys Thr Ala 195 200 205 Ile Lys Asn Asn Thr Glu Leu Gly Glu Glu Asp Val Asn Phe Ile Val 210 215 220 Gln Arg Leu Leu Asn Arg 
Ile Ile Phe Leu Arg Val Cys Glu Asp Arg 225 230 235 240 Thr Ile Glu Lys Tyr Glu Thr Ile Lys Ser Ile Lys Asn Tyr Glu Glu 245 250 255 Leu Lys Asp Leu Phe Gln Lys Ser Asp Arg Lys Phe Asn Ser Gly Leu 260 265 270 Phe Asp Phe Ile Asp Asp Thr Leu Leu Leu Glu Val Glu Ile Asp Ser 275 280 285 Asn Val Leu Ile Glu Ile Phe Ser Asp Leu Tyr Phe Pro Gln Ser Pro 290 295 300 Tyr Asp Phe Ser Val Val Asp Pro Thr Ile Leu Ser Gln Ile Tyr Glu 305 310 315 320 Arg Phe Leu Gly Gln Glu Ile Ile Ile Glu Ser Gly Gly Thr Phe His 325 330 335 Ile Thr Glu Ser Pro Glu Val Ala Ala Ser Asn Gly Val Val Pro Thr 340 345 350 Pro Lys Ile Ile Val Glu Gln Ile Val Lys Asp Thr Leu Thr Pro Leu 355 360 365 Thr Glu Gly Lys Lys Phe Asn Glu Leu Cys Asn Leu Lys Ile Ala Asp 370 375 380 Ile Cys Cys Gly Ser Gly Thr Phe Leu Ile Ser Ser Tyr Asp Phe Leu 385 390 395 400 Val Glu Lys Val Met Glu Lys Ile Ile Glu Glu Asn Ile Asp Asp Ser 405 410 415 Asp Leu Val Tyr Glu Thr Glu Glu Gly Leu Ile Leu Thr Leu Lys Ala 420 425 430 Lys Arg Asn Ile Leu Glu Asn Asn Leu Phe Gly Val Asp Val Asn Pro 435 440 445 Tyr Ala Val Glu Val Ala Glu Phe Ser Leu Leu Leu Lys Leu Leu Glu 450 455 460 Gly Glu Asn Glu Ala Ser Val Asn Asn Phe Ile His Glu His Glu Asp 465 470 475 480 Lys Ile Leu Pro Asp Leu Thr Ser Ile Ile Lys Cys Gly Asn Ser Leu 485 490 495 Val Asp Asn Lys Phe Phe Glu Phe Met Pro Glu Ser Leu Glu Asp Asp 500 505 510 Glu Ile Leu Phe Lys Ala Asn Pro Phe Glu Trp Glu Glu Glu Phe Pro 515 520 525 Asp Ile Met Ala Asn Gly Gly Phe Asp Ala Ile Ile Gly Asn Pro Pro 530 535 540 Tyr Val Arg Ile Gln Asn Met Lys Lys Tyr Ser Pro Glu Glu Ile Glu 545 550 555 560 Tyr Tyr Gln Ser Lys Asp Ser Glu Tyr Thr Val Ala Lys Lys Glu Thr 565 570 575 Val Asp Lys Tyr Phe Leu Phe Ile Glu Arg Ala Leu Ile Leu Leu Asn 580 585 590 Pro Thr Gly Leu Leu Gly Tyr Ile Ile Pro His Lys Phe Phe Ile Thr 595 600 605 Lys Gly Gly Lys Glu Leu Arg Lys Phe Ile Ala Glu Lys His Gln Ile 610 615 620 Ser Lys Ile Ile Asn Phe Gly Val Thr Gln Val Phe Pro Gly Arg Ala 625 630 635 640 Thr Tyr Thr Ala Ile Leu 
Ile Ile Gln Ala Asn Lys Met Ala Gln Phe 645 650 655 Lys Tyr Lys Lys Val Ser Asn Ile Ser Ala Glu Thr Leu Asp Ser Glu 660 665 670 Glu Asn Thr Cys Val Tyr Ser Ser Glu Lys Tyr Asn Ser Asp Pro Trp 675 680 685 Ile Phe Leu Ser Pro Glu Thr Glu Ala Val Phe Thr Lys Phe Thr Glu 690 695 700 Ala Gln Phe Glu Lys Leu Gly Glu Ile Thr Asp Ile Ser Val Gly Leu 705 710 715 720 Gln Thr Ser Ala Asp Lys Ile Tyr Ile Phe Ile Pro Glu Asn Glu Thr 725 730 735 Ser Asp Thr Tyr Ile Phe Asn Tyr Lys Gly Lys Arg Tyr Glu Ile Glu 740 745 750 Lys Ser Ile Cys Cys Pro Ala Ile Tyr Asp Leu Ser Phe Gly Ser Phe 755 760 765 Glu Ser Ile Gln Gly Asn Ala Gln Met Ile Phe Pro Tyr Glu Ile Arg 770 775 780 Asp Glu Glu Ala Tyr Leu Leu Glu Glu Glu Thr Leu Glu Asn Asp Tyr 785 790 795 800 Pro Leu Ala Trp Asn Tyr Leu Asn Glu Phe Lys Glu Ala Leu Glu Lys 805 810 815 Arg Ser Leu Gln Gly Arg Asn Pro Lys Trp Tyr Gln Tyr Gly Arg Ser 820 825 830 Gln Ser Leu Ser Lys Phe His Asp Lys Glu Lys Leu Ile Trp Thr Val 835 840 845 Leu Ala Thr Lys Pro Pro Tyr Val Leu Asp Arg Asn Asn Leu Leu Phe 850 855 860 Thr Gly Gly Gly Asn Gly Pro Tyr Tyr Gly Leu Ile Asn Gln Ser Ile 865 870 875 880 Tyr Ser Leu His Tyr Phe Leu Gly Ile Leu Ser His Pro Val Ile Glu 885 890 895 Ser Met Val Lys Ala Arg Ala Ser Glu Phe Arg Gly Ser Tyr Tyr Ser 900 905 910 His Gly Lys Gln Phe Ile Glu Lys Ile Pro Ile Arg Lys Ile Asp Phe 915 920 925 Asp Asp Gln Asp Glu Val Asp Lys Tyr Asn Thr Val Val Thr Thr Val 930 935 940 Glu Lys Leu Ile Ile Thr Thr Asp Arg Ile Lys Ser Glu Ser Asn Gly 945 950 955 960 Pro Arg Arg Arg Met Leu Arg Arg Arg Leu Asp Ala Leu Ser Asn Gln 965 970 975 Leu Ile Gln Val Ile Asn Glu Leu Tyr Asn Ile Ser Asp Glu Glu Tyr 980 985 990 Thr Thr Val Leu Asn Asp Glu Met Leu Thr Ala Ala Leu Gly Glu Glu 995 1000 1005 Lys 5 24 DNA Bacillus pumilus 5 gtggaaacgg accgtattat ggtt 24 6 24 DNA Bacillus pumilus 6 caccagtaaa taacaggtta ttcc 24 7 27 DNA Bacillus pumilus 7 ttcgtagcaa gtacggtcca tatcagt 27 8 27 DNA Bacillus pumilus 8 ccgtatgtac ttgataggaa taacctg 27 9 24 DNA 
Bacillus pumilus 9 aggaactaag aaagttcata gctg 24
10 24 DNA Bacillus pumilus 10 atgcggtatt atataaccca acag 24
11 24 DNA Bacillus pumilus 11 tgacgtcctc ttcacctaat tcgg 24
12 24 DNA Bacillus pumilus 12 gagtttgtga agatagaacc attg 24
13 48 DNA Bacillus pumilus 13 agcggatccg gaggtaaata aatgaatcaa ttaattgaaa atgttaat 48
14 42 DNA Bacillus pumilus 14 aagggggcat gcttatactt atttcttcgt tctattgttt ct 42
15 51 DNA Bacillus pumilus 15 caaggatccg gaggtaaata aatgcatata agtgagttag tagataaata c 51
16 36 DNA Bacillus pumilus 16 ttaggatcct catttttctt ctcctaacgc cgctgt 36
17 54 DNA Bacillus pumilus 17 caccaatcta gaggaggtaa ataaatgcat ataagtgagt tagtagataa atac 54
18 42 DNA Bacillus pumilus 18 tgaaatctcg agttatcctg atccacaaca tatatctgct at 42
19 54 DNA unknown Synthetic primer 19 caccaatcta gaggaggtaa ataaatgcat ataagtgagt tagtagataa atac 54
20 39 DNA unknown Synthetic primer 20 gtttatacga agtgtataag ctggattttt ctttgtctc 39
21 39 DNA unknown Synthetic primer 21 gagacaaaga aaaatccagc ttatacactt cgtataaac 39
22 36 DNA unknown Synthetic primer 22 ttaggatcct catttttctt ctcctaacgc cgctgt 36
23 54 DNA unknown Synthetic primer 23 ggtggttcta gaggaggtaa ataaatgtct aatgaaaatt ataacattga tttc 54
24 39 DNA unknown Synthetic primer 24 ggtggtgagc tcctattgac ataatcgatc atcaagaag 39
25 42 DNA unknown Synthetic primer 25 atagggtgga ttgcctaata ttacatcaaa gccaccattt gc 42
26 40 DNA unknown Synthetic primer 26 tttgatgtaa tattaggcaa tccaccctat ataagaattc 40
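
The synthetic primers of SEQ ID NOs: 20 and 21 in the listing above appear to be exact reverse complements of one another, which is consistent with a complementary primer pair; this is an inference from the sequences themselves, not a statement made in the listing. A minimal Python sketch for checking that relationship is given below. The function name `reverse_complement` and the constant names are illustrative only.

```python
# Complement table for lowercase DNA, as used in the sequence listing.
_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string; spaces are ignored."""
    cleaned = seq.replace(" ", "").lower()
    return cleaned.translate(_COMPLEMENT)[::-1]


# Sequences copied from the listing; the 10-base spacing is removed.
SEQ_ID_20 = "gtttatacga agtgtataag ctggattttt ctttgtctc".replace(" ", "")
SEQ_ID_21 = "gagacaaaga aaaatccagc ttatacactt cgtataaac".replace(" ", "")

if __name__ == "__main__":
    # Both primers are listed as 39 nucleotides long.
    print(len(SEQ_ID_20), len(SEQ_ID_21))
    # True when SEQ ID NO: 21 is the reverse complement of SEQ ID NO: 20.
    print(reverse_complement(SEQ_ID_20) == SEQ_ID_21)
```

The same helper can be applied to any other pair in the listing to test whether two primers anneal to opposite strands of the same region.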

What is claimed is:
 1. A method for altering the cleavage specificity of a Type IIG restriction endonuclease, the Type IIG restriction endonuclease characterized by a cleavage domain adjacent to a methylase domain, the methylase domain located adjacent to a specificity domain, the method comprising: (a) ligating a first DNA sequence and a second DNA sequence to form a fusion DNA, wherein (i) the first DNA sequence comprises a DNA segment encoding a catalytic domain and an N-terminal portion of a methylase domain of a first Type IIG restriction endonuclease, and (ii) the second DNA sequence comprises a DNA segment encoding a specificity domain and a C-terminal portion of a methylase domain of a second Type IIG restriction endonuclease; such that the ligation occurs between sequences encoding the methylase domain; and (b) transforming a host cell with the fusion DNA for expressing a Type IIG restriction endonuclease with altered cleavage specificity.
 2. A method according to claim 1, wherein step (a) further comprises: introducing a mutation into the cleavage domain to enhance the viability of the transformed host cell.
 3. A method according to claim 1, wherein the sequence corresponding to the N-terminal portion of the methylase terminates in a methylase conserved motif selected from motifs X, I, II, III, IV, V, VI, VII or VIII.
 4. A method according to claim 1, wherein the sequence corresponding to the C-terminal portion of the methylase terminates in a methylase conserved motif.
 5. A method according to claim 3 or 4, wherein the conserved motif is selected from motifs X, I, II, III, IV, V, VI, VII or VIII, and wherein the N-terminal portion and the C-terminal portion of the methylase are non-overlapping.
 6. A method according to claim 3 or 5, wherein the sequence corresponding to the N-terminal portion of the methylase terminates between the sequence encoding motif III and NPPY in motif IV.
 7. A method according to claim 1, wherein ligation occurs by means of a linker sequence attached to each of the N-terminal portion of the methylase domain and the C-terminal portion of the methylase domain on the first and second DNA segments.
 8. A method according to claim 1, wherein the fusion DNA encodes an active methylase domain.
 9. A method according to claim 1, wherein the first and second Type IIG endonucleases have defined cleavage and recognition sites.
 10. A method according to claim 1, wherein the first Type IIG endonuclease has a defined cleavage and recognition site and the second Type IIG endonuclease is characterized by a bioinformatic search of a microbial sequence database.
 11. A method for forming a non-natural, functional Type IIG restriction endonuclease, wherein the Type IIG restriction endonuclease is characterized by a functional cleavage domain, a functional methylase domain and an altered functional specificity domain, compared with a natural form of the functional Type IIG endonuclease, comprising: (a) inserting into a DNA encoding the methylase domain or the specificity domain of the natural form of the functional Type IIG endonuclease, a mutation or a nucleic acid linker sequence for optionally inactivating the cleavage domain and inactivating (i) the functional methylase domain and the specificity domain or (ii) the functional methylase domain or the functional specificity domain; (b) ligating to the DNA at the mutation or at the linker, a DNA encoding (i) a portion of the methylase and specificity domain or (ii) a portion of the methylase or specificity domain to form a fusion DNA; and (c) transforming a host cell having a marker for detecting a colony expressing a non-natural functional Type IIG restriction endonuclease.
 12. A method according to claim 11, wherein the mutation is positioned within a conserved motif in the methylase domain.
 13. A method according to claim 11, wherein the mutation is a deletion at a 5′-end of the DNA encoding the specificity domain.
 14. A method according to claim 11, wherein the mutation is a deletion within the specificity domain.
 15. A method according to claim 11, wherein the linker is a transposon-mediated linker insertion sequence.
 16. A method according to claim 11, wherein the linker contains a restriction endonuclease cleavage site which is unique within the DNA encoding the restriction endonuclease. 
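
Two of the constraints recited in the claims above lend themselves to a simple computational check: locating the NPPY sequence of motif IV in a methylase domain (claim 6), and confirming that a restriction site occurs only once in a DNA (claim 16). The Python sketch below is illustrative only and is not part of the claimed methods. The function names are hypothetical; the protein fragment is taken from SEQ ID NO: 4, with residue 537 as its first position by the listing's own numbering; and SEQ ID NO: 15 is used purely as a short test string containing GGATCC, the BamHI recognition sequence, not because the source identifies it as a linker.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
_COMPLEMENT = str.maketrans("acgt", "tgca")


def to_one_letter(three_letter: str) -> str:
    """Convert a space-separated three-letter protein string to one-letter code."""
    return "".join(THREE_TO_ONE[res] for res in three_letter.split())


def motif_position(protein: str, motif: str, first_residue: int = 1):
    """Return the residue number where the motif starts, or None if absent."""
    index = protein.find(motif)
    return None if index < 0 else index + first_residue


def count_site(dna: str, site: str) -> int:
    """Count occurrences of a site on both strands, overlaps included."""
    dna, site = dna.lower(), site.lower()
    # A palindromic site equals its reverse complement, so the set holds one entry.
    targets = {site, site.translate(_COMPLEMENT)[::-1]}
    return sum(
        dna.startswith(target, i)
        for target in targets
        for i in range(len(dna) - len(site) + 1)
    )


# Residues 537-551 of SEQ ID NO: 4, copied from the sequence listing.
MOTIF_IV_FRAGMENT = "Asp Ala Ile Ile Gly Asn Pro Pro Tyr Val Arg Ile Gln Asn Met"
# SEQ ID NO: 15, copied from the sequence listing with spacing removed.
SEQ_ID_15 = (
    "caaggatccg gaggtaaata aatgcatata agtgagttag tagataaata c".replace(" ", "")
)

if __name__ == "__main__":
    fragment = to_one_letter(MOTIF_IV_FRAGMENT)
    print(fragment)                                             # DAIIGNPPYVRIQNM
    print(motif_position(fragment, "NPPY", first_residue=537))  # 542
    print(count_site(SEQ_ID_15, "ggatcc") == 1)                 # True: site is unique
```

Counting on both strands matters for non-palindromic sites, where the reverse-strand occurrence reads differently on the top strand; for a palindromic site such as GGATCC the two searches coincide and are counted once.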